CANDIDA ALBICANS PROTEINS ASSOCIATED WITH VIRULENCE AND
HYPHAL FORMATION AND USES THEREOF

BACKGROUND OF THE INVENTION 

(a) Field of the Invention

The invention relates to Candida albicans proteins, such as
CaCla4p, Cst20p, CaCdc42p and CaBem1p, associated with
virulence and hyphal formation, and to uses thereof, such as
the design of screening tests for inhibitors for the
treatment of pathogenic fungal infections and/or inflammatory
conditions.

(b) Description of Prior Art

Candida albicans is the major fungal pathogen in humans,
causing various forms of candidiasis. The incidence of
infections is increasing in immunocompromised patients. This
fungus is diploid with no sexual cycle and is capable of a
morphological transition from a unicellular budding yeast to
a filamentous form. Extensive filamentous growth leads to the
formation of a mycelium displaying hyphae with branches and
lateral buds. In view of the observation that hyphae seem to
adhere to and invade host tissues more readily than does the
yeast form, the switch from the yeast to the filamentous form
probably contributes to the virulence of this organism (for a
review see Fidel, P. L. & Sobel, J. D. (1994) Trends
Microbiol. 2, 202-205). The molecular mechanisms by which
morphological switching is regulated are poorly understood.

Like C. albicans, baker's yeast Saccharomyces cerevisiae is
also a dimorphic organism capable of switching under certain
nutritional conditions from a budding yeast to a filamentous
form. Under the control of nutritional signals, diploid cells
switch to pseudohyphal growth (Gimeno, C. J. et al. (1992)
Cell 68, 1077-1090), and haploid cells to invasive growth
(Roberts, R. L. & Fink, G. R. (1994) Genes Dev. 8,
2974-2985).

The similarities between the dimorphic switching of S.
cerevisiae and C. albicans suggest that these morphological
pathways may be regulated by similar mechanisms in both
organisms. In S. cerevisiae, morphological transitions are
controlled by signaling components that are also involved in
the mating response of haploid cells (Roberts, R. L. & Fink,
G. R. (1994) Genes Dev. 8, 2974-2985; Liu, H. et al. (1993)
Science 262, 1741-1744). The switch to pseudohyphal growth
requires a transcription factor encoded by the STE12 gene,
and a mitogen-activated protein (MAP) kinase cascade
including Ste7p (a homolog of MAP kinase kinase or MEK),
Ste11p (a MEK kinase homolog) and Ste20p (a MEK kinase
kinase) (Roberts, R. L. & Fink, G. R. (1994) Genes Dev. 8,
2974-2985; Liu, H. et al. (1993) Science 262, 1741-1744). The
MAP kinases involved in this response are as yet unknown
(Roberts, R. L. & Fink, G. R. (1994) Genes Dev. 8, 2974-2985;
Liu, H. et al. (1993) Science 262, 1741-1744).

Members of the Ste20p family of serine/threonine protein
kinases are thought to be involved in triggering
morphogenetic processes in response to external signals in
organisms ranging from yeast to mammalian cells. Two of these
kinases, Ste20p and Cla4p, are well characterized in S.
cerevisiae (Leberer, E. et al. (1992) EMBO J. 11, 4815-4824;
Cvrckova, F. et al. (1995) Genes Dev. 9, 1817-1830). Ste20p
is required for pheromone signal transduction (Leberer, E. et
al. (1992) EMBO J. 11, 4815-4824) and for filamentous growth
in response to nitrogen starvation (Roberts, R. L. & Fink, G.
R. (1994) Genes Dev. 8, 2974-2985; Liu, H. et al. (1993)
Science 262, 1741-1744), and shares an essential function
with Cla4p during budding (Cvrckova, F. et al. (1995) Genes
Dev. 9, 1817-1830). Ste20p and Cla4p interact with the small
G-protein Cdc42p, and this interaction is required for
viability of S. cerevisiae cells. Ste20p also interacts with
the SH3 domain protein Bem1p, and this interaction plays a
role in morphogenetic processes (Leeuw, T. et al. (1995)
Science 270, 1210-1213).

Here we show that Cst20p, a C. albicans homolog of the Ste20p
protein kinase, is required for hyphal growth of C. albicans
under certain in vitro conditions. We also show in a mouse
model for systemic candidiasis that Cst20p plays a role in
virulence, as judged from significantly prolonged survival of
mice infected with CST20-deleted cells. Our results suggest
that Cst20p acts in a regulatory pathway which is involved in
hyphal growth of C. albicans.

We also demonstrate that CaCla4p, a C. albicans homolog of
the Cla4p protein kinase, is required for hyphal formation in
vitro in response to serum, and in vivo in a mouse model for
systemic candidiasis. We also show that CaCla4p is required
for efficient colonization of kidneys with C. albicans cells
after infection of mice and is essential for virulence in the
mouse model.

SUMMARY OF THE INVENTION 

One aim of the present invention is to provide Candida
albicans proteins, such as CaCla4p, Cst20p, CaCdc42p and
CaBem1p, and uses thereof.

Another aim of the present invention is to provide the
nucleotide and amino acid sequences of CaCla4p, Cst20p,
CaCdc42p and CaBem1p.

Another aim of the present invention is to provide screening
tests for inhibitors of CaCla4p, Cst20p, CaCdc42p and CaBem1p
or of their interactions.

The term "fungi" when used herein is intended to mean any
fungi, pathogenic or not, which show hyphal induction using
kinases, such as C. albicans, Saccharomyces cerevisiae,
Aspergillus, Ustilago maydis, and all the species of the
fungal genera Aspergillus, Blastomyces, Candida,
Cladosporium, Coccidioides, Cryptococcus, Epidermophyton,
Exophiala, Fonsecaea, Histoplasma, Madurella, Malassezia,
Microsporum, Paracoccidioides, Penicillium,
Phaeoannellomyces, Phialophora, Scedosporium, Sporothrix,
Torulopsis, Trichophyton, Trichosporon, Ustilago, Wangiella,
Xylohypha, among others.

In accordance with the present invention there is provided
an in vitro screening test for compounds that inhibit the
biological activity of at least one protein selected from the
group consisting of CaCla4p, Cst20p, Cdc42p and Bem1p, which
comprises:

a) at least one of the proteins; and

b) means to monitor the biological activity of the at least
one protein;

whereby compounds are tested for their inhibiting potential.

In accordance with another embodiment of the present
invention, the inhibition of the interactions between CaCla4p
and CaCdc42p is determined.

In accordance with another embodiment of the present
invention, the inhibition of the interactions between Cst20p
and CaCdc42p is determined.

BRIEF DESCRIPTION OF THE DRAWINGS

Figs. 1A to 1D illustrate photomicrographs which show that
the C. albicans CST20 gene complements defects in
pseudohyphal growth of ste20/ste20 S. cerevisiae diploid
cells.

Figs. 2A to 2C show the morphology of S. cerevisiae MATa
cells (strain YEL306-1A) deleted for STE20 and CLA4, and
transformed with plasmids expressing CLA4 (Fig. 2A), STE20
(Fig. 2B) and C. albicans CST20 (Fig. 2C).

Figs. 3A to 3C show the nucleotide (SEQ ID NO:5) and
predicted amino acid (SEQ ID NO:6) sequences of CST20.

Fig. 4A illustrates the deletion of CST20 in C. albicans.

Fig. 4B illustrates the Southern blot analysis with a CST20
fragment from EcoRI to XbaI as a probe.

Figs. 5A to 5J show colonies of C. albicans cells grown for
5 days at 37°C on solid "Spider" medium containing mannitol.
Wild type strain SC5314 (A), ura3/ura3 cst20Δ/cst20Δ::URA3
strain CDH22 (B), ura3/ura3 cst20Δ/cst20Δ::CST20::URA3 strain
CDH36 (obtained by reintegration of CST20 into strain CDH25
by homologous recombination using linearized plasmid pDH190)
(C), ura3/ura3 cst20Δ/cst20Δ strain CDH25 transformed with
plasmids pYPB1-ADHpt (D) and pYPB1-ADHpt-HST7 (E), ura3/ura3
hst7Δ/hst7Δ strain CDH12 transformed with plasmids pVEC (F),
pVEC-HST7 (G), pYPB1-ADHpt (H), and pYPB1-ADHpt-HST7 (I), and
ura3/ura3 cph1/cph1 strain CDH72 [ura3/ura3 derivative of
strain JK19] transformed with pYPB1-ADHpt-HST7 (J).
Photomicrographs of representative colonies were taken with a
2x lens (bar = 2 mm).

Figs. 6A to 6C illustrate virulence assays. Survival curves
of mice (n=10 for each C. albicans strain at each inoculation
dose) infected with 1 x 10^ (A) and 1 x 10^ (B) cells of C.
albicans strains SC5314 (wild type), CAI4 (ura3/ura3), CDH22
(ura3/ura3 cst20Δ/cst20Δ::URA3). (C) Staining of mouse kidney
sections with periodic acid Schiff's stain 48 hours after
infection with cst20Δ/cst20Δ::URA3 mutant strain CDH22 (a).
Some hyphal cells are indicated with arrows (bar = 0.1 mm).

Figs. 7A to 7B illustrate the nucleotide (SEQ ID NO:7) and
predicted amino acid (SEQ ID NO:8) sequences of CaCLA4.

Fig. 8A illustrates the deletion of CaCLA4 in C. albicans.

Fig. 8B illustrates the Southern blot analysis with the
CaCLA4 fragment from PstI to XbaI as a probe.

Fig. 8C illustrates the Northern blot analysis with the
CaCLA4 fragment as a probe. PCR with the divergent
oligodeoxynucleotides OEL109 and OEL110 was used to delete
the coding sequence of CaCLA4. A hisG-URA3-hisG cassette was
then inserted, and homologous recombination was used in a
two-step procedure to replace both CaCLA4 alleles.

Fig. 9 illustrates virulence assays. Survival curves of mice
(n=15 for each C. albicans strain) infected with 1 x 10^
cells of C. albicans strains SC5314 (wild-type), CDH77
(CaCLA4/cacla4Δ), CLJ1 (cacla4Δ/cacla4Δ) and CLJ5
(cacla4Δ/cacla4Δ) transformed with the control plasmid pVEC
and plasmid pVEC-CaCLA4 carrying the CaCLA4 gene.

Fig. 10 illustrates the staining of mouse kidney sections
with periodic acid Schiff's stain 48 h after infection with
C. albicans strains SC5314 and CLJ1.

Fig. 11 illustrates the nucleotide (SEQ ID NO:9) and
predicted amino acid (SEQ ID NO:10) sequences of CaCdc42p.

Figs. 12A to 12B illustrate the nucleotide (SEQ ID NO:11) and
predicted amino acid (SEQ ID NO:12) sequences of CaBem1p.

DETAILED DESCRIPTION OF THE INVENTION 

The CST20 gene of Candida albicans was cloned by functional
complementation of a deletion of the STE20 gene in
Saccharomyces cerevisiae. CST20 encodes a homolog of the
Ste20p/p65PAK family of protein kinases. Colonies of C.
albicans cells deleted for CST20 revealed defects in the
lateral formation of mycelia on synthetic solid "Spider"
media. However, hyphal development was not impaired in some
other media. Cells deleted for CST20 were less virulent in a
mouse model for systemic candidiasis. Our results suggest
that more than one signaling pathway can trigger hyphal
development in C. albicans, one of which has a protein kinase
cascade that is analogous to the mating response pathway in
S. cerevisiae and might have become adapted to the control of
mycelial formation in asexual C. albicans.

The CaCLA4 gene of C. albicans was cloned by functional
complementation of the growth defect of S. cerevisiae cells
deleted for the STE20 and CLA4 genes. CaCLA4 encodes a
homolog of the Ste20p family of serine/threonine protein
kinases with pleckstrin homology and Cdc42p binding domains
in the amino-terminal non-catalytic region. Deletion of both
alleles of CaCLA4 in C. albicans caused defects in hyphal
formation in vitro in synthetic liquid and solid media, and
in vivo in a mouse model for systemic candidiasis. The
deletions reduced the invasion of C. albicans cells into
kidneys after infection of mice and completely suppressed
virulence in the mouse model. Thus, hyphal formation of C.
albicans mediated by the CaCla4p protein kinase may
contribute to the pathogenicity of this dimorphic fungus.

The CaBEM1 and CaCDC42 genes of C. albicans were cloned by
functional complementation of the growth defect of S.
cerevisiae cells deleted for the BEM1 and CDC42 genes,
respectively. CaBEM1 encodes an SH3 domain protein with
homology to Bem1p, and CaCDC42 encodes a small G-protein with
homology to members of the Rho-family of G-proteins.
MATERIALS AND METHODS
Yeast manipulations

The yeast form of C. albicans was cultured at 30°C in YPD
medium. Hyphal growth was induced at 37°C on solid "Spider"
media (Liu, H. et al. (1994) Science 266, 1723-1726)
containing 1% (w/v) nutrient broth, 0.2% (w/v) K2HPO4, 2%
(w/v) agar and 1% (w/v) of the indicated sugars (pH 7.2 after
autoclaving). Cells were grown in liquid "Spider" media at
30°C to stationary phase, and then incubated for 5 days at
37°C on solid "Spider" media at a density of about 200 cells
per 80 mm plate. All media were supplemented with uridine (25
µg/ml) for the growth of Ura- strains. Germ tube formation
was induced at 37°C in either 10% fetal bovine serum
(GIBCO/BRL) or liquid "Spider" media containing the indicated
sugars at an inoculation density of 10^ cells per ml.
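
As a rough worked example of the solid "Spider" recipe above, the following Python sketch converts the stated w/v percentages into grams for a chosen batch volume. Only the percentages come from the text; the function name, the dictionary keys and the 500 ml batch size are illustrative assumptions.

```python
# Percent (w/v) means grams of solute per 100 ml of final volume.
# The percentages are those stated for solid "Spider" medium in the text;
# the component keys and the helper name are hypothetical.
SPIDER_SOLID_PERCENT_WV = {
    "nutrient_broth": 1.0,
    "K2HPO4": 0.2,
    "agar": 2.0,
    "sugar": 1.0,  # the sugar indicated for each experiment, e.g. mannitol
}


def grams_needed(volume_ml, recipe=SPIDER_SOLID_PERCENT_WV):
    """Return the grams of each component for `volume_ml` of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: pct * volume_ml / 100.0 for name, pct in recipe.items()}


if __name__ == "__main__":
    # Assumed batch size of 500 ml, chosen only for illustration.
    for name, grams in grams_needed(500).items():
        print(f"{name}: {grams:.2f} g")
```

The pH adjustment and uridine supplementation described above are not modelled here.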

Yeast manipulations were performed according to standard
procedures.
Isolation of CST20 

The CST20 gene was isolated from a genomic C. albicans
library constructed in plasmid YEp352 from genomic DNA of the
clinical isolate WO1 (Boone, C. et al. (1991) J. Bacteriol.
173, 6859-6864). A plasmid carrying an amino-terminally
truncated version of CST20 missing the first 918 nucleotides
of coding sequence was isolated by screening for suppressors
of defects in basal FUS1::HIS3 expression and mating in S.
cerevisiae strain YEL64, which was disrupted in STE20. A
fragment from nucleotides 958 to 1,252 of CST20 was amplified
by the polymerase chain reaction (PCR) and used as a probe to
isolate a full length clone by colony hybridization to the C.
albicans genomic library transformed into E. coli strain
MC1061. Both DNA strands were sequenced by the dideoxy chain
termination method. The full length clone was subcloned
between the SacI and HindIII sites of the S. cerevisiae
centromere plasmid pRS316 to yield plasmid pRL53.
Isolation of CaCLA4

The S. cerevisiae MATa strain YEL257-1A-2 deleted for STE20
and CLA4 and carrying plasmid pDH129 with CLA4 under control
of the GAL1 promoter was transformed with the genomic C.
albicans library constructed in the S. cerevisiae vector
YEp352 carrying URA3 as selectable marker (Boone, C. et al.
(1991) J. Bacteriol. 173, 6859-6864). Transformants were
grown on selective medium in 4% galactose and then
replica-plated to selective medium containing 2% glucose to
select for plasmids that were able to support growth in the
absence of Cla4p and Ste20p. By screening 1,600
transformants, we isolated plasmid YEp352-CaCLA4 carrying an
insert of 5.6 kb with an open reading frame of 2,913 bp
capable of encoding a homolog of Cla4p. Subcloning indicated
that this open reading frame was responsible for
complementation. Both DNA strands were sequenced by the
dideoxy chain termination method.
Molecular cloning of CaCDC42 

The S. cerevisiae MATa strain DJTD2-16A carrying the
cdc42-1ts mutation was transformed with the genomic C.
albicans library constructed in the S. cerevisiae vector
YEp352 carrying URA3 as selectable marker (Boone, C. et al.
(1991) J. Bacteriol. 173, 6859-6864). Transformants were
grown on selective medium at room temperature. Colonies were
then replica-plated to selective medium and grown at 34°C. By
screening 2,000 transformants, we isolated plasmid
YEp352-CaCDC42 carrying an open reading frame of 573 bp
capable of encoding a homolog of Cdc42p. Both DNA strands
were sequenced by the dideoxy chain termination method.
Subcloning of various restriction endonuclease fragments
indicated that the open reading frame was responsible for
complementation of the temperature-sensitive growth defect
caused by the cdc42-1ts mutation.
Molecular cloning of CaBEM1

The S. cerevisiae MATa strain YEL220-1A deleted for BEM1 and
carrying plasmid pGAL-BEM1 with BEM1 under control of the
GAL1 promoter was transformed with the genomic C. albicans
library constructed in the S. cerevisiae vector YEp352
carrying URA3 as selectable marker (Boone, C. et al. (1991)
J. Bacteriol. 173, 6859-6864). Transformants were grown on
selective medium in 4% galactose and then replica-plated to
selective medium containing 2% glucose to select for plasmids
that were capable of supporting growth of Bem1p-depleted
cells. We isolated plasmid YEp352-CaBEM1 carrying an open
reading frame of 1,905 bp fulfilling this criterion and
capable of encoding a homolog of Bem1p. Both DNA strands were
sequenced by the dideoxy chain termination method, and
subcloning of various restriction endonuclease fragments
indicated that this open reading frame was responsible for
complementation.
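
As a quick sanity check on the open reading frame lengths quoted in the cloning sections above, the following Python sketch confirms that each is a whole number of codons and reports the codon count. The lengths are taken from the text; whether each figure includes a stop codon is not stated there, so the counts are codons rather than a firm residue number. The helper name is hypothetical.

```python
# ORF lengths in base pairs as stated in the text for each cloned gene.
ORF_LENGTHS_BP = {"CaCLA4": 2913, "CaCDC42": 573, "CaBEM1": 1905}


def codon_count(orf_length_bp):
    """Return the number of codons in an ORF of the given length.

    Raises ValueError if the length is not a multiple of three, which
    would indicate a transcription error or an incomplete reading frame.
    """
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError(f"{orf_length_bp} bp is not a whole number of codons")
    return orf_length_bp // 3


if __name__ == "__main__":
    for gene, length in ORF_LENGTHS_BP.items():
        print(f"{gene}: {length} bp = {codon_count(length)} codons")
```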
Construction of C. albicans strains and plasmids

To construct a CST20 null mutant, an EcoRI to SacI fragment
from nucleotide positions 989 to 4,134 of CST20 was subcloned
into the Bluescript KS(+) vector (Stratagene) to yield
plasmid pDH119. A plasmid that contained CST20-flanking
sequences from nucleotides 989 to 1,674, and 3,423 to 4,134
joined with BamHI sites, was then created by PCR using the
divergent oligodeoxynucleotide primers ODH68
(5'-CGGGATCCAGACCAACCACTCGAACTACT-3'; SEQ ID NO:1) and ODH69
(5'-CGGGATCCGAAGGTGAACCACCATATTTG-3'; SEQ ID NO:2; each
primer carries a newly introduced BamHI site, GGATCC, near
its 5' end) and plasmid pDH119 as a template. The amplified
DNA was cleaved with BamHI and ligated with a 4 kb BamHI to
BglII fragment of a hisG-URA3-hisG cassette derived from
plasmid pCUB-6 (Fonzi, W. A. & Irwin, M. Y. (1993) Genetics
134, 717-728) to yield plasmid pDH183. This plasmid was
linearized with XhoI and SacI and transformed into the Ura-
C. albicans strain CAI4 (Fonzi, W. A. & Irwin, M. Y. (1993)
Genetics 134, 717-728) to partially replace the coding region
of one of the chromosomal CST20 alleles with the
hisG-URA3-hisG cassette by homologous recombination. Ura+
transformants were selected on Ura- medium, and integration
of the cassette into the CST20 locus was verified by Southern
blot analysis. Spontaneous Ura- derivatives of two of the
heterozygous disruptants were selected on medium containing
5-fluoroorotic acid. These clones were screened by Southern
blot hybridization to identify those which had lost the URA3
gene by intrachromosomal recombination mediated by the hisG
repeats. This procedure was then repeated to delete the
remaining functional allele of CST20.

A similar procedure was employed to delete the CaCLA4 gene. A
4.6 kb XbaI fragment of YEp352-CaCLA4 was subcloned into the
pBluescript KS(+) vector (Stratagene) to yield plasmid
pDH205. A plasmid that contained CaCLA4 flanking sequences
joined with BglII sites was then created by PCR using the
divergent oligodeoxynucleotide primers OEL109
(5'-GAAGATCTTGTAATCAATGTTCCCGTGGA-3'; SEQ ID NO:3) and OEL110
(5'-GAAGATCTCATCGTGATATTAAATCCGAT-3'; SEQ ID NO:4; each
primer carries a newly introduced BglII site, AGATCT, near
its 5' end) and plasmid pDH205 as template. The amplified DNA
was cleaved with BglII and ligated with a 4 kb BamHI-BglII
fragment of a hisG-URA3-hisG cassette derived from plasmid
pCUB-6 (Fonzi, W. A. & Irwin, M. Y. (1993) Genetics 134,
717-728) to yield plasmid pDH210. This plasmid was linearized
with PstI and SacI and transformed into the Ura- C. albicans
strain CAI4 (Fonzi, W. A. & Irwin, M. Y. (1993) Genetics 134,
717-728) to replace the coding region of one of the
chromosomal CaCLA4 alleles with the hisG-URA3-hisG cassette
by homologous recombination. Ura+ transformants were selected
on Ura- medium, and integration of the cassette into the
CaCLA4 locus was verified by Southern blot analysis.
Spontaneous Ura- derivatives were then selected on medium
containing 5-fluoroorotic acid. These clones were screened by
Southern blot hybridization to identify those which had lost
the URA3 gene by intrachromosomal recombination mediated by
the hisG repeats. This procedure was then repeated to delete
the remaining functional allele of CaCLA4.
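
The primer design described above can be checked with a short Python sketch that locates the expected restriction site in each primer. The primer sequences are those given in the text (SEQ ID NO:1 to 4, with spacing removed), and GGATCC and AGATCT are the standard recognition sequences of BamHI and BglII. The function names and data layout are illustrative assumptions, not part of the original method.

```python
# Standard recognition sequences for the two enzymes named in the text.
RECOGNITION_SITES = {"BamHI": "GGATCC", "BglII": "AGATCT"}

# Primer sequences as given in the text, paired with the site each should carry.
PRIMERS = {
    "ODH68": ("CGGGATCCAGACCAACCACTCGAACTACT", "BamHI"),
    "ODH69": ("CGGGATCCGAAGGTGAACCACCATATTTG", "BamHI"),
    "OEL109": ("GAAGATCTTGTAATCAATGTTCCCGTGGA", "BglII"),
    "OEL110": ("GAAGATCTCATCGTGATATTAAATCCGAT", "BglII"),
}


def site_positions(sequence, site):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def check_primers(primers=PRIMERS, sites=RECOGNITION_SITES):
    """Map each primer name to the positions of its expected site."""
    return {
        name: site_positions(seq, sites[enzyme])
        for name, (seq, enzyme) in primers.items()
    }


if __name__ == "__main__":
    for name, positions in check_primers().items():
        print(name, positions)
```

Each primer carries exactly one copy of its site, starting two bases from the 5' end, which is consistent with cleaving the PCR product at both ends before ligation to the cassette.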

To reintegrate CST20 into the genome of mutant strains, the
C. albicans integration plasmid pDH190 was constructed by
subcloning a KpnI to PstI fragment of CST20 into pBS-cURA3
(pBluescript KS(+) into which the C. albicans URA3 gene was
cloned between the NotI and XbaI sites of the polylinker).
The integration plasmid was then linearized with NsiI and
transformed into C. albicans to target integration into the
NsiI site of the cst20Δ::hisG fusion gene. Integrations were
selected on Ura- medium and confirmed by Southern blot
analysis.

The C. albicans CST20 expression plasmid pDH188 was
constructed by subcloning a SacI to PstI fragment of CST20
into plasmid pVEC carrying a C. albicans autonomously
replicating sequence and URA3 as selectable marker. The C.
albicans plasmid pVEC-CaCLA4 was constructed by subcloning
the KpnI to SacI insert of YEp352-CaCLA4 into plasmid pVEC.
Northern blot analyses

Northern blots of total and poly(A)+ RNA from C. albicans
cells were performed as described (Leberer, E. et al. (1992)
EMBO J. 11, 4815-4824). Signals were quantified by 2-D
radioimaging.
Animal experiments 

Eight-week-old male CFW-1 mice (Harlan-Winkelmann, Paderborn,
Germany) were inoculated with 1 x 10^ or 1 x 10^ cells by
intravenous injection. Survival curves were calculated
according to the Kaplan-Meier method using the PRISM™ program
(GraphPad Software Inc., San Diego) and compared using the
log-rank test. A P value <0.05 was considered significant.
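
For readers unfamiliar with the Kaplan-Meier method named above, the following is a minimal Python sketch of the estimator. It is not the PRISM implementation used in the study, the data in the usage example are hypothetical, and the log-rank comparison is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- observation time for each animal (e.g. days after infection)
    events -- 1 if the animal died at that time, 0 if it was censored
    Returns a list of (time, survival_probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Animals still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Deaths recorded exactly at time t (censored animals excluded).
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Hypothetical data, not taken from the experiments described here.
    days = [2, 3, 3, 5]
    died = [1, 1, 0, 1]
    for day, s in kaplan_meier(days, died):
        print(day, round(s, 3))
```

Censored animals, for example those still alive at the end of the observation period, reduce the number at risk at later times without themselves lowering the survival estimate.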

To quantify colony-forming C. albicans units in kidneys, mice
were sacrificed by cervical dislocation 48 hours after
injection and kidneys were homogenized in 5 ml phosphate
buffered saline, serially diluted and plated on YNG medium
(0.67% yeast nitrogen base, 1% glucose, pH 7.0). Histological
examination of kidney sections was done with periodic acid
Schiff's stain.
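
The back-calculation from plate counts to colony-forming units in the homogenate can be sketched as follows. Only the 5 ml homogenate volume comes from the text; the function name, the dilution factor, the plated volume and the colony count in the usage example are hypothetical values chosen for illustration.

```python
def cfu_in_homogenate(colonies, dilution_factor, plated_volume_ml,
                      homogenate_volume_ml=5.0):
    """Estimate total colony-forming units in a kidney homogenate.

    colonies             -- colonies counted on the plate
    dilution_factor      -- total fold dilution of the plated sample (e.g. 100)
    plated_volume_ml     -- volume spread on the plate, in ml
    homogenate_volume_ml -- homogenate volume (5 ml in the text)
    """
    if plated_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml


if __name__ == "__main__":
    # Hypothetical example: 50 colonies from 0.1 ml of a 100-fold dilution.
    print(f"{cfu_in_homogenate(50, 100, 0.1):.3g} CFU")
```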

RESULTS 

Isolation and characterization of CST20 

A C. albicans homolog of the S. cerevisiae STE20 gene was
cloned by functional complementation of the pheromone
signaling defect of S. cerevisiae cells that were deleted for
the STE20 gene. The mating defect of the STE20-deleted S.
cerevisiae strain YEL20 was fully complemented by
introduction of the centromeric plasmid pRL53 carrying full
length CST20 (mating efficiency was 81±9% in cells expressing
CST20, compared with 85±8% in cells expressing STE20; n=3).
Similarly, defects in growth arrest and morphological changes
in response to pheromone were completely cured by
transformation with the CST20 plasmid.

As shown in Fig. 1, nitrogen deficiency-induced pseudohyphae
formation, which is blocked by disruption of STE20 in diploid
cells (Liu, H., Styles, C. & Fink, G. R. (1993) Science 262,
1741-1744), was restored by introduction of the CST20
plasmid. Colonies of the diploid STE20 wild type strain L5266
(4) (Fig. 1A) and the isogenic ste20/ste20 strain HLY492 (4)
transformed with either the control plasmid pRS316 (Fig. 1B),
the CST20 plasmid pRL53 (Fig. 1C), or the STE20 plasmid
pSTE20-5 (9) (Fig. 1D) were grown on nitrogen starvation
medium (2) for 5 days at 30°C. Photomicrographs were taken
with a 4x objective (bar = 1 mm).

As illustrated in Fig. 2, the cytokinesis defect caused by
deletion of CLA4, encoding an S. cerevisiae isoform of Ste20p
(Cvrckova, F. et al. (1995) Genes Dev. 9, 1817-1830), was not
complemented by CST20. However, the lethality caused by
deletion of both STE20 and CLA4 (Cvrckova, F. et al. (1995)
Genes Dev. 9, 1817-1830) could be rescued by CST20 (Fig. 2).
The diploid strain YEL306 heterozygous for
ste20Δ::TRP1/STE20 cla4Δ::LEU2/CLA4 was transformed with
plasmid pRS316 carrying either no insert, CLA4 (pRL21), CST20
(pRL53) or STE20 (pSTE20-5), and then sporulated and
dissected. No viable haploid ste20Δ cla4Δ spores were
obtained from transformants with the plasmid without insert,
but they were obtained from transformants with plasmids
carrying CLA4 (Fig. 2A), STE20 (Fig. 2B) or CST20 (Fig. 2C).

Cells were grown to mid-exponential phase in YPD medium at
30°C. No viable ste20Δ cla4Δ segregants were obtained in
medium containing 5-fluoro-orotic acid, suggesting that the
plasmids were essential for viability. Neither STE20 nor
CST20 was able to suppress the morphological defect of cla4Δ
cells. Photomicrographs were taken by phase contrast with a
40x objective (bar = 30 µm).

The open reading frame of CST20 is capable of encoding a
protein of 1,229 amino acids with a predicted molecular
weight of 133 kDa and a domain structure characteristic of
the Ste20p/p65PAK family of protein kinases (Fig. 3).
Numerals at the left margin indicate nucleotide and amino
acid positions (Fig. 3). Nucleotide 1 corresponds to the
first nucleotide of the initiation codon and amino acid 1 to
the first residue of the deduced protein. The putative p21
binding domain has been shaded, and the kinase domain has
been boxed.
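
The predicted molecular weights quoted in this document can be roughly cross-checked from residue counts. The Python sketch below uses an average residue mass of about 110 Da, a common rule of thumb that is an assumption here rather than the method used in the text; an exact value requires summing the masses of the actual residues in the sequence.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0

# Residue counts and predicted masses (kDa) as stated in the text.
PROTEINS = {
    "Cst20p": {"residues": 1229, "reported_kda": 133},
    "CaCla4p": {"residues": 971, "reported_kda": 107},
}


def approx_mass_kda(n_residues, avg_residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Return a rough protein mass in kDa from the number of residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    for name, info in PROTEINS.items():
        estimate = approx_mass_kda(info["residues"])
        print(f"{name}: ~{estimate:.1f} kDa estimated, "
              f"{info['reported_kda']} kDa reported")
```

Both estimates land within a few percent of the reported values, which is about as close as a fixed average residue mass can be expected to get.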

The catalytic domain present in the carboxyl terminal half of
the protein has sequence identities of 76 and 56%,
respectively, with S. cerevisiae Ste20p (Leberer, E. et al.
(1992) EMBO J. 11, 4815-4824) and Cla4p (Cvrckova, F. et al.
(1995) Genes Dev. 9, 1817-1830). The amino terminal,
non-catalytic region contains a sequence from amino acid
residues 473 to 531 with 68% identity to the p21 binding
domain of Ste20p that has been shown to bind the small GTPase
Cdc42p. This region contains the sequence motif ISxPxxxxHxxH
thought to be important for the interaction of the p21
binding domain with the GTP-bound forms of Cdc42Hs and Rac1
(Cvrckova, F. et al. (1995) Genes Dev. 9, 1817-1830). The
remaining non-catalytic sequences are less conserved. Unique
sequences not present in Ste20p and the other members of the
family are found at the amino terminus and between the p21
binding and catalytic domains.

A CST20 transcript of 4.9 kb in size was detected in Northern
blots. This transcript was present at similar levels in yeast
cells grown in YPD at room temperature and germ tubes induced
by a temperature shift to 37°C.

Isolation and characterization of CaCLA4 

A C. albicans homolog of the S. cerevisiae CLA4 gene was
cloned by functional complementation of the growth defect of
S. cerevisiae cells that were deleted for the STE20 and CLA4
genes.

The open reading frame of the CaCLA4 gene is capable of
encoding a protein of 971 amino acids with a predicted
molecular weight of 107 kDa and a domain structure
characteristic of the Ste20p family of protein kinases (Fig.
7). The catalytic domain present in the carboxyl terminal
half of the protein has sequence identities of 74, 63 and
64%, respectively, with S. cerevisiae Cla4p, S. cerevisiae
Ste20p and an uncharacterized open reading frame present in
the S. cerevisiae genome, 65% with the C. albicans Ste20p
homolog Cst20p, and 61% with rat p65PAK (Fig. 7). The amino
terminal, non-catalytic region contains a sequence from amino
acid residues 69 to 180 with similarity to pleckstrin
homology (PH) domains and a sequence from amino acid residues
229 to 292 with 63% identity to the Cdc42p binding domain of
S. cerevisiae Cla4p that has been shown to bind the small
GTPase Cdc42p (Cvrckova, F. et al. (1995) Genes Dev. 9,
1817-1830). The remaining non-catalytic sequences are less
conserved.
Chromosomal deletion of CST20 

Homologous recombination was used in a multistep procedure to
partially delete CST20 in a Ura- C. albicans strain (Fig.
4A). PCR with the divergent oligodeoxynucleotides ODH68 and
ODH69 was used to partially delete the coding sequence of
CST20. A hisG-URA3-hisG cassette was then inserted. The
deletion was confirmed by Southern blot analyses (Fig. 4B).
The genomic DNA samples digested with XhoI were from the
following strains: lane 1, CAI4 (ura3/ura3 CST20/CST20); lane
2, CDH15 (ura3/ura3 CST20/cst20Δ::hisG-URA3-hisG); lane 3,
CDH18 (ura3/ura3 CST20/cst20Δ::hisG); lane 4, CDH22
(ura3/ura3 cst20Δ::hisG-URA3-hisG/cst20Δ::hisG); lane 5,
CDH25 (ura3/ura3 cst20Δ::hisG/cst20Δ::hisG). Northern blots
showed that the CST20 transcript was absent in the
corresponding homozygous deletion strains.

The lateral outgrowth of hyphae from colonies grown on solid
"Spider" media containing mannitol or sorbitol was completely
blocked by deletion of CST20 (Fig. 5B).

Mycelial formation was drastically reduced when the media
contained galactose, mannose or raffinose. The mutant strains
regained the ability to form hyphae when wild type CST20 was
reintroduced by transformation with the CST20 expression
plasmid pDH188 or reintegrated into the genome by targeted
homologous recombination (Fig. 5C). The CST20 transcript was
detected in these strains by Northern blot analysis.

Mutant strains formed hyphae when colonies were grown on
"Spider" media containing either glucose or N-acetyl
glucosamine. Normal hyphae formation was also observed on
rice agar and on agar containing Lee's medium or 10% serum.
The frequency of germ-tube formation in either liquid Lee's
medium, 10% serum or liquid "Spider" media containing any of
the sugars tested above was also normal. These results
indicate that Cst20p is not required for hyphae formation
under all conditions but is involved in the lateral formation
of mycelia on some solid surfaces.
Chromosomal deletion of CaCLA4

Homologous recombination was used in a multistep procedure
to delete both alleles of CaCLA4 in C. albicans (Fig. 8A).
Fig. 8A shows the restriction endonuclease map of CaCLA4.
The coding sequence is indicated by the arrow. PCR with the
divergent oligodeoxynucleotides OEL109 and OEL110 was used
to delete the coding sequence of CaCLA4. A hisG-URA3-hisG
cassette was then inserted and a two-step procedure was used
to delete both alleles of CaCLA4 by homologous
recombination. The restriction endonuclease sites are as
follows: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII; P, PstI;
S, SacI; X, XbaI. The deletions were confirmed by Southern
blot analyses (Fig. 8B). Southern blot analysis was
performed with a 1.1 kb PstI-XbaI fragment of CaCLA4 as a
probe. The genomic DNA samples digested with EcoRI were from
the following strains: Lanes: 1, CAI4 (ura3/ura3
CaCLA4/CaCLA4); 2, CDH77 (ura3/ura3
CaCLA4/cacla4Δ::hisG-URA3-hisG); 3, CDH88 (ura3/ura3
CaCLA4/cacla4Δ::hisG); 4, CLJ1 (ura3/ura3
cacla4Δ::hisG-URA3-hisG/cacla4Δ::hisG); and 5, CLJ5
(ura3/ura3 cacla4Δ::hisG/cacla4Δ::hisG). Northern blots
showed that the CaCLA4 transcript with a size of 4.1 kb was
reduced to about 40% in heterozygous CaCLA4/cacla4Δ cells
and was absent in homozygous cacla4Δ/cacla4Δ deletion cells
(Fig. 8C). The transcript was present at about wild-type
levels when the CaCLA4 gene was retransformed into the
homozygous deletion cells by using an autonomously
replicating plasmid carrying the CaCLA4 gene (Fig. 8C).
Northern blot analysis was performed with poly(A)+ RNA
isolated from the following strains grown in the yeast form
in YPD at 30°C: Lanes: 1, SC5314 (wild-type); 2, CDH88; 3,
CLJ5 transformed with pVEC; 4, CLJ5 transformed with
pVEC-CaCLA4. The blot was probed with fragments specific for
CaCLA4 (upper panel) or CaACT1 (lower panel) and quantified
by radioimaging. Numbers at the bottom of the figure depict
the relative amounts of CaCLA4 transcript in relation to the
amounts of CaACT1 transcript (mean values of two independent
experiments).
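
The normalization described for the Northern blot (CaCLA4 signal relative to the CaACT1 signal, averaged over two experiments) can be sketched as follows. This is only an illustration: the function names and the radioimaging counts are hypothetical and are not data from the specification.

```python
from statistics import mean


def relative_transcript(target_signal, loading_control_signal):
    """Normalize a target band (e.g. CaCLA4) to a loading control (e.g. CaACT1)."""
    if loading_control_signal <= 0:
        raise ValueError("loading-control signal must be positive")
    return target_signal / loading_control_signal


def percent_of_wild_type(sample_ratios, wild_type_ratios):
    """Mean normalized signal of a strain, expressed as a percentage of wild type."""
    return 100.0 * mean(sample_ratios) / mean(wild_type_ratios)


# Hypothetical counts (target, control) from two independent experiments.
wild_type = [relative_transcript(1000, 500), relative_transcript(900, 450)]
heterozygote = [relative_transcript(420, 525), relative_transcript(380, 475)]

print(round(percent_of_wild_type(heterozygote, wild_type), 1))
```

Dividing by the actin signal first corrects for lane-to-lane loading differences before strains are compared.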
We found that viability of C. albicans cells was not
affected by deleting either one or both alleles of CaCLA4.
Mutant cells showed the same growth behavior as wild-type
cells, regardless of whether the cells were grown under
conditions favoring the yeast or the filamentous form.
However, deletion of both CaCLA4 alleles generated defects
in cellular morphology, producing a heterogeneous population
of aberrantly shaped cells that were frequently multibudded
and multinucleated. This phenotype indicates a defect in
cytokinesis resembling the phenotype of S. cerevisiae cells
deleted for CLA4 (Cvrckova, F. et al. (1995) Genes Dev. 9,
1817-1830).

Deletion of both CaCLA4 alleles caused defects in hyphal
formation in all media and under all conditions that we
investigated. When morphological switching was induced in
liquid media by either serum, N-acetyl glucosamine, proline,
pH increase, temperature shift, or Lee's medium, wild-type
cells and cells deleted for one or both alleles of CaCLA4
produced germ tubes after about 30 minutes. In wild-type
cells and cells deleted for only one allele of CaCLA4, these
germ tubes elongated and grew into long hyphae after
prolonged incubation. Cells deleted for both alleles of
CaCLA4, however, failed to produce hyphae. Instead, these
cells produced multiple short protrusions giving rise to an
aberrant morphology.

On solid media containing either serum, rice agar or
mannitol, the normal formation of mycelia was completely
suppressed by deletion of both CaCLA4 alleles. This
phenotype was reversed by introducing the CaCLA4 gene on a
plasmid, and deletion of only one allele had no effect.
Virulence studies

To determine the role of Cst20p in virulence, mice were
injected intravenously with wild-type and mutant strains and
monitored for survival and for fungal invasion into kidneys.
We found that the Ura- strain CAI4 was not pathogenic
(Figs. 6A and B). However, infection with Ura+ wild-type
cells resulted in rapid mortality with a rate that was
dependent on the dose of injected cells (1 x 10^ cells in
Fig. 6A, and 1 x 10^ cells in Fig. 6B). Survival was
significantly prolonged, however, in mice infected with Ura+
cells deleted for both alleles of CST20
(cst20Δ/cst20Δ::URA3). This effect, which was reproducible
and statistically significant, was observed at high
(Fig. 6A) or low (Fig. 6B) doses of infection (with P values
of 0.027 and 0.001, respectively) and correlated with
colony-forming units per kidney (1-5 x 10^ for wild-type
cells and 7 x 10^ for cst20Δ/cst20Δ::URA3 mutant cells)
after 48 hours of infection with 1 x 10^ cells. These
effects on virulence could be reversed by reintroducing
CST20 into the strain deleted for both CST20 alleles, and
were not observed in Ura+ cells deleted for only one CST20
allele. A histological examination revealed that cells
deleted for both alleles of CST20 were able to form hyphae
in infected kidneys (Fig. 6C).

To investigate whether CaCla4p is required for virulence,
mice were injected intravenously with wild-type and mutant
C. albicans strains and monitored for survival and for
fungal invasion into kidneys. Infections with CaCLA4
wild-type cells (strain SC5314) resulted in rapid mortality
(Fig. 9). No difference in the mortality rate was observed
after infection with cells deleted for only one allele of
CaCLA4 (strain CDH77). All mice survived, however, after
infection with cells deleted for both alleles of CaCLA4
(strain CLJ1 and CLJSpVECl). This effect correlated with a
reduction in the amount of colony-forming units per kidney
of infected animals and was reversed by transformation of
the cells with a plasmid carrying the CaCLA4 gene (strain
CLJ5CaCLA4) (Fig. 9). A histological examination revealed
that kidneys from mice injected with either wild-type cells
or cells deleted for one allele of CaCLA4 were heavily
infected with C. albicans cells that produced hyphae densely
penetrating the animal tissue (Fig. 10, left panel), whereas
kidneys from mice injected with cells deleted for both
CaCLA4 alleles contained small foci of aberrantly shaped
cells that frequently carried multiple protrusions (Fig. 10,
right panel). The morphologies of these cells were similar
to those induced by serum under in vitro conditions. Thus,
the function of CaCla4p is required for morphological
switching of C. albicans under in vitro and in vivo
conditions and for virulence.

Molecular cloning of the CaCDC42 and CaBEM1 genes

The C. albicans homolog of the CDC42 gene, CaCDC42, was
cloned by functional complementation of the
temperature-sensitive growth defect of S. cerevisiae cells
carrying the cdc42-1ts mutation. The growth defect was fully
complemented by plasmid YEp352-CaCDC42. The open reading
frame of the CaCDC42 gene is capable of encoding a protein
of 191 amino acids with homology to the Rho-family of small
G-proteins (Fig. 11). The highest homology is found with
Cdc42p from S. cerevisiae.

The C. albicans homolog of the BEM1 gene, CaBEM1, was cloned
by functional complementation of the growth defect of
S. cerevisiae cells deleted for the BEM1 gene. This defect
was fully complemented by plasmid YEp352-CaBEM1 carrying the
CaBEM1 gene. The open reading frame of the CaBEM1 gene is
capable of encoding a protein of 635 amino acids with a
domain structure characteristic of Bem1p (Fig. 12). CaBem1p
contains two conserved SH3 domains which are most homologous
to the SH3 domains of Bem1p, and also has homology to Bem1p
outside of the SH3 domains.
Discussion 

In S. cerevisiae, Ste20p fulfills multiple functions during
mating (Leberer, E. et al. (1992) EMBO J. 11, 4815-4824),
pseudohyphae formation (Liu, H., Styles, C. & Fink, G. R.
(1993) Science 262, 1741-1744), invasive growth (Roberts,
R. L. & Fink, G. R. (1994) Genes Dev. 8, 2974-2985) and
cytokinesis (Cvrckova, F. et al. (1995) Genes Dev. 9,
1817-1830). CST20 expression in S. cerevisiae fully
complements these functions. Thus, Cst20p has the potential
to fulfill similar functions in C. albicans.

The yeast-to-hyphal transition of C. albicans is a
morphological change that can be triggered by a wide variety
of factors. Carbohydrates, amino acids, salts, and serum
have been described as inducers of germ tube formation, as
have pH changes, temperature increases and starvation, but
no single environmental factor could be defined as uniquely
significant in stimulating the morphological switch. Hence
C. albicans appears capable of responding to many divergent
environmental signals. Disruption of both CPH1 alleles,
which encode a homolog of the S. cerevisiae Ste12p
transcription factor (Liu, H. et al. (1994) Science 266,
1723-1726), suppressed the lateral formation of mycelia from
colonies grown on solid "Spider" medium, but did not block
hyphal development in other media. We have shown that
C. albicans mutant cells deleted for CST20 display a similar
phenotype, and that the effect of these mutations on hyphal
development is dependent on the carbon source in which the
cells were grown.
These observations are consistent with the idea that several
signaling pathways can trigger morphogenesis in C. albicans.
Furthermore, the behavior of C. albicans mutant strains
deleted for either CPH1 or CST20 indicates that these
pathways might operate independently to activate hyphal
development under differing environmental conditions.
C. albicans encounters a variety of different
microenvironments during the development of superficial and
systemic infections. Hence, the existence of parallel
morphogenetic signaling pathways might provide a distinct
advantage to this pathogen.

Our results indicate that the pathway controlled by Cst20p
is not essential for virulence in a mouse model of systemic
infections. It is conceivable that this pathway plays a role
in other forms of infections, for example in the development
of superficial infections of the mucosal epithelia (thrush).
An as yet undefined role of Cst20p in pathogenicity outside
of the Cst20p signaling pathway is suggested, however, by
the prolonged survival of mice infected with CST20-deleted
cells. It is unlikely that this effect is caused by defects
in hyphal formation, since a histological examination of
infected kidneys revealed that the CST20-deleted cells are
not restricted in their capacity to form hyphae.

In S. cerevisiae, Cla4p plays a role in cytokinesis and
shares with Ste20p an essential function for polarized
growth during budding (Cvrckova, F. et al. (1995) Genes Dev.
9, 1817-1830). Cla4p binds the Rho-like small G-protein
Cdc42p (Cvrckova, F. et al. (1995) Genes Dev. 9, 1817-1830),
which is involved in controlling cell polarity during
budding and in response to pheromone. Like Ste20p and the
mammalian homolog p21-activated kinase (p65PAK), Cla4p is
able to phosphorylate and activate myosin-I, a mechanism
that may contribute to the organization of the actin
cytoskeleton.

Our finding that CaCLA4 expression in S. cerevisiae
completely complements the Cla4p functions suggests that
CaCla4p may have similar properties in C. albicans. Thus,
CaCla4p may be required for myosin-I driven polarized growth
during hyphal formation in a mechanism that may involve the
C. albicans homolog of Cdc42p. Our complementation assays in
S. cerevisiae suggest that CaCla4p may share an essential
function with Cst20p, the C. albicans homolog of Ste20p
(Figs. 6A and 6B). Together with our findings that null
mutants of CaCLA4 are completely non-pathogenic (Fig. 10)
and null mutants of CST20 are reduced in virulence (Figs. 6A
and 6B), this suggests that CaCla4p and Cst20p, and proteins
such as CaCdc42p and CaBem1p interacting with these protein
kinases, may be valid targets for the development of
antifungal agents.

The present invention will be more readily understood by
referring to the following examples which are given to
illustrate the invention rather than to limit its scope.

EXAMPLE I 

Screening test for inhibitors of CaCla4p and Cst20p

An in vitro assay containing the proteins CaCla4p and/or
Cst20p will be used to test compounds for inhibition of the
activity of these proteins, with the aim of rendering
avirulent any fungi which may be pathogenic.

The activity of the proteins will be monitored, using myelin
basic protein as a substrate, to determine whether the
compounds tested inhibit their biological activity.

In cases where selective inhibition of CaCla4p and Cst20p,
but not of p65PAK, is desired, compounds testing positive
for the inhibition of both CaCla4p and Cst20p will be tested
to determine if they also inhibit the protein p65PAK. This
would be useful in cases of pathogenic fungal infection,
such as by C. albicans, where the fungus is to be rendered
avirulent without affecting the normal protein of the
patient.

WO 98/18927                                   PCT/CA97/00809

In some cases of inflammation, it would be desirable to be
provided with compounds inhibiting all three proteins,
namely, CaCla4p, Cst20p and p65PAK.

EXAMPLE II 

Screening test for inhibitors of CaCla4p and CaCdc42p
interactions

An in vitro assay containing the proteins CaCla4p and
CaCdc42p will be used to test compounds inhibiting their
interactions.

CaCla4p may be solid phase bound and CaCdc42p will be in
suspension, free to interact with CaCla4p. A labeled
antibody specific to CaCdc42p will be added to the assay to
determine the presence of CaCdc42p bound to CaCla4p.
Compounds that test positive for inhibition of the
CaCdc42p-CaCla4p interaction should allow only a minute
quantity of CaCdc42p to bind to CaCla4p.

The analogous in vitro assay will be used to test compounds
that inhibit the interaction between Cst20p and CaCdc42p.
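
As an illustration of how the readout of such a solid-phase binding assay might be scored, the following minimal sketch converts the labeled-antibody signal into percent inhibition. The function names, the background and uninhibited control wells, and the 80% hit threshold are assumptions made for this example and are not specified in the text.

```python
def percent_inhibition(signal, background, uninhibited):
    """Percent loss of bound-label signal relative to a no-compound control.

    background  -- signal of a well without the soluble binding partner
    uninhibited -- signal of a well with both proteins and no compound
    """
    window = uninhibited - background
    if window <= 0:
        raise ValueError("uninhibited control must exceed background")
    return 100.0 * (1.0 - (signal - background) / window)


def is_hit(signal, background, uninhibited, threshold=80.0):
    """Flag a compound when inhibition reaches the (hypothetical) threshold."""
    return percent_inhibition(signal, background, uninhibited) >= threshold


# Hypothetical plate-reader counts for one compound well and its controls.
print(percent_inhibition(signal=150, background=100, uninhibited=1100))
print(is_hit(signal=600, background=100, uninhibited=1100))
```

Subtracting the background before scaling keeps non-specific antibody binding from being counted as residual interaction.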

EXAMPLE III

Screening test for inhibitors of CaCla4p and CaBem1p
interactions

An in vitro assay containing the proteins CaCla4p and
CaBem1p will be used to test compounds inhibiting their
interactions.

CaCla4p may be solid phase bound and CaBem1p will be in
suspension, free to interact with CaCla4p. A labeled
antibody specific to CaBem1p will be added to the assay to
determine the presence of CaBem1p bound to CaCla4p.
Compounds that test positive for inhibition of the
CaBem1p-CaCla4p interaction should allow only a minute
quantity of CaBem1p to bind to CaCla4p.

The analogous in vitro assay will be used to test compounds
that inhibit the interaction between Cst20p and CaBem1p.


EXAMPLE IV 

A two-hybrid CaCdc42p and CaCla4p interaction system in
a humanized S. cerevisiae strain

This screening assay is based on the assumption that the
interaction of the small G-protein CaCdc42p with its
cellular targets Cst20p and CaCla4p is essential for
viability of C. albicans cells. This assumption is
reasonable given work that has been performed in
S. cerevisiae (Leberer E. et al. (1997) EMBO J. 16, 83-97).
The two-hybrid interaction system will use green fluorescent
protein fused to the GAL1 promoter as a functional readout.
This reporter gene will be integrated into a S. cerevisiae
strain in which the STE20 and CLA4 genes have been replaced
by the human homolog p65PAK. The CaCDC42 gene will be fused
to the DNA binding domain of GAL4, and the CaCLA4 gene will
be fused to the activation domain of GAL4. Interaction of
the two proteins will cause green fluorescence, whereas
inhibitors of the interaction will suppress fluorescence.

Non-specific inhibitors of the two-hybrid interaction system
will be excluded by performing a parallel screen with
unrelated fusion proteins known to interact. Compounds of
general toxicity or inhibitors of the human homologs will
also be excluded in this system because those compounds will
not allow growth of the cells and therefore reduce the
fluorescent readout in both parallel screens.
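
The exclusion logic of the two parallel screens can be sketched as a simple classification of each compound. This is an illustrative sketch only: the `WellReading` type, the use of optical density as the growth readout, and both cutoff values are assumptions introduced here, not parameters given in the text.

```python
from dataclasses import dataclass


@dataclass
class WellReading:
    fluorescence: float  # GFP signal of the well
    od600: float         # culture density, used here as the growth readout


def classify(test, test_ref, control, control_ref,
             growth_cutoff=0.7, signal_cutoff=0.5):
    """Classify one compound from the test screen and the parallel control screen.

    test / control         -- wells treated with the compound
    test_ref / control_ref -- untreated wells of the same strain
    Cutoffs are hypothetical fractions of the untreated reference.
    """
    test_growth = test.od600 / test_ref.od600
    control_growth = control.od600 / control_ref.od600
    if test_growth < growth_cutoff or control_growth < growth_cutoff:
        return "toxic or growth-inhibiting"
    test_signal = test.fluorescence / test_ref.fluorescence
    control_signal = control.fluorescence / control_ref.fluorescence
    if test_signal < signal_cutoff and control_signal < signal_cutoff:
        return "non-specific"
    if test_signal < signal_cutoff:
        return "candidate specific inhibitor"
    return "inactive"


untreated = WellReading(fluorescence=1000.0, od600=1.0)
print(classify(WellReading(200.0, 0.95), untreated,
               WellReading(980.0, 0.97), untreated))
```

Growth is checked first so that a compound that merely prevents growth is never reported as an interaction inhibitor.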
A two-hybrid yeast strain carrying the GAL4-GFP fusion gene
will be constructed. This strain will be deleted for the
CLA4 gene using the TRP1 marker as described (Leberer E. et
al. (1997) EMBO J. 16, 83-97). The STE20 gene will be
replaced by the human PAK gene as described above. To
replace the CDC42 gene by its human homolog, an integrating
plasmid will be constructed carrying the HsCDC42 gene fused
to a URA3 blaster gene and CDC42 flanking sequences. After
linearization, the construct will be transformed into the
PAK-containing two-hybrid strain, and integrants will be
selected on -ura medium. The URA3 gene will then be looped
out on FOA medium. The various gene disruptions and gene
replacements will be verified by Southern blot analyses.

The two-hybrid vectors carrying the CaCDC42 gene fused to
the GAL4 DNA binding domain and the CaCLA4 gene fused to the
transcriptional activation domain of GAL4 will be
constructed by standard procedures. To facilitate the
interaction of the two proteins, we will use site-directed
mutagenesis to create a mutation in the CAAX-box domain of
CaCdc42p to prevent isoprenylation and targeting of the
fusion protein to the plasma membrane. We will evaluate and
optimize the assay system and adapt the assay conditions to
the scale used in microtiter plates for automated screening
of compounds.
EXAMPLE V

Detection of the presence of C. albicans using probes 

The sequence of any one of the genes CaCLA4, CST20, CaCDC42
and CaBEM1 may be used to derive probes for the detection of
C. albicans using PCR techniques or hybridization assays.
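
As a minimal illustration of probe design, the sketch below checks whether a candidate probe matches a target sequence on either strand. The target is the first 60 nucleotides of SEQ ID NO:5 as printed in the sequence listing; the 15-mer probe and the function names are chosen here purely for illustration, and exact string matching stands in for real hybridization, which tolerates mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_probe(target, probe):
    """Return (strand, 0-based position) for exact probe matches on either strand."""
    target = target.upper()
    hits = []
    for strand, query in (("+", probe.upper()), ("-", reverse_complement(probe))):
        start = target.find(query)
        while start != -1:
            hits.append((strand, start))
            start = target.find(query, start + 1)
    return hits


# First 60 nt of SEQ ID NO:5; the probe is a hypothetical 15-mer taken from it.
target = "GTACCCACTTTACAATCACTTACAAGTCAAATAATTACAACTTGACAATCCTCACTTTAA"
probe = "ACAATCACTTACAAG"
print(find_probe(target, probe))
print(find_probe(target, reverse_complement(probe)))
```

Searching both strands matters because a probe synthesized from the antisense strand detects the same locus. A palindromic probe would be reported once per strand.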

EXAMPLE VI 

Use of nucleotide sequences of CaCLA4, CST20, CaCDC42
and CaBEM1 to identify homologues from other fungi

The nucleotide sequences of CaCLA4, CST20, CaCDC42 and
CaBEM1 may be used to identify and clone homologues from
other fungi.

EXAMPLE VII 

A S. cerevisiae-based screening system using CaSte20p
and the pheromone signaling pathway as drug target

In this system, we will use green fluorescent protein (GFP)
under transcriptional control of a pheromone-inducible
promoter (FUS1) as a readout. The pheromone signaling
pathway, and thereby the reporter gene, will be induced with
pheromone in two different strains: first, a strain in which
STE20 is functionally replaced by the CaSTE20 gene, and
second, a strain in which STE20 is functionally replaced by
the mammalian homolog PAK. Compounds that block the
induction of the reporter gene in the CaSTE20 strain but not
in the PAK strain are expected to be specific inhibitors of
the C. albicans kinase. This assay is very specific and is a
positive selection of compounds that excludes the finding of
compounds with inhibitory action against the mammalian
homolog PAK or compounds of general toxicity.

The FUS1 gene, including its promoter, will be isolated by
the polymerase chain reaction (PCR) from genomic DNA of
S. cerevisiae and fused to the GFP gene from Aequorea
victoria on a yeast expression plasmid. The function of the
reporter gene will be analyzed after transformation of a
MATa yeast strain and induction with pheromone.

The STE20 gene will be replaced in a supersensitive sst1
yeast strain by the human PAK gene using homologous
recombination. For this purpose, an integrating plasmid will
be constructed carrying the PAK gene fused to a URA3 blaster
gene and STE20 flanking sequences. The construct will be
linearized and transformed into yeast, and integrants will
be selected on -ura medium. The URA3 gene will then be
looped out on FOA medium to gain back the ura3 marker.
Correct integration of the PAK gene will be confirmed by
Southern blot analysis.

The humanized strain will then be transformed with the
FUS1-GFP reporter gene and analyzed for a functional
signaling pathway by measuring green fluorescence after
induction with pheromone. The assay system will be
evaluated, optimized and adapted to the scale used in
microtiter plates.

EXAMPLE VIII

Fluorescence resonance energy transfer (FRET) as probe
for protein-protein interactions

The engineering of different GFP mutants with altered
fluorescence characteristics allows the use of fluorescence
resonance energy transfer (FRET) to probe protein-protein
interactions (Heim and Tsien (1996) Curr. Biol. 6, 178-182).
The FRET phenomenon consists of an energy transfer between a
donor and an acceptor fluorochrome. If the excitation and
emission wavelengths are compatible, FRET is easily
measurable. The main parameter of the reaction is the
distance between donor and acceptor, which must be in the
range of nanometers. This is precisely the range of
distances found in protein-protein interactions.
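
The distance dependence mentioned above follows the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-to-acceptor distance and R0 is the distance at which transfer is 50% efficient. The sketch below evaluates it; the R0 of 5 nm is a hypothetical value used only for illustration and is not a measured property of the GFP mutants named in the text.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Foerster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical R0 of 5 nm; efficiency falls off steeply beyond it.
for r in (2.5, 5.0, 10.0):
    print(r, round(fret_efficiency(r, 5.0), 4))
```

Because of the sixth-power term, doubling the distance from R0 cuts the efficiency from 50% to under 2%, which is why disrupting a protein-protein contact produces a clear change in signal.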

We propose to develop a novel yeast assay system which uses
FRET to measure the in vivo interaction between CaCdc42p and
CaCla4p. The CaCDC42 gene will be fused to a GFP mutant that
acts as donor, and the CaCLA4 gene will be fused to a mutant
that acts as acceptor. The yeast strain used as an
expression system will be humanized as described in Example
VII. Inhibitors of the interaction are expected to reduce
energy transfer, and this reduction can be easily measured
spectroscopically. The interaction of unrelated proteins
known to interact will be used as a reference to exclude
non-specific inhibitors of the assay system. Compounds
inhibiting the interaction of the human homologs or of
general toxicity will be excluded by inhibition of growth
and therefore reduced fluorescence in both screens.

The CaCDC42 gene will be fused to the gene encoding the
gfp^^^" mutant as donor, and the CaCLA4 gene will be fused
to the gene encoding the GFP^"^ mutant as acceptor (Heim and
Tsien (1996) Curr. Biol. 6, 178-182). The constructs will
then be transformed into the humanized yeast strain
described in Example VII, and the FRET phenomenon will be
analyzed in yeast cultures using fluorescence spectroscopy.
The conditions for the assay will be worked out and
optimized. We will adapt the assay conditions to the scale
used in microtiter plates for automated screening.

While the invention has been described in connection with
specific embodiments thereof, it will be understood that it
is capable of further modifications and this application is
intended to cover any variations, uses, or adaptations of
the invention following, in general, the principles of the
invention and including such departures from the present
disclosure as come within known or customary practice within
the art to which the invention pertains and as may be
applied to the essential features hereinbefore set forth,
and as follows in the scope of the appended claims.
SEQUENCE LISTING
(1) GENERAL INFORMATION 
(i) APPLICANT: National Research Council of Canada 

(ii) TITLE OF THE INVENTION: CANDIDA ALBICANS PROTEINS 

ASSOCIATED WITH VIRULENCE AND HYPHAL FORMATION AND USES 
THEREOF 

(iii) NUMBER OF SEQUENCES: 12 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SWABEY OGILVY RENAULT 

(B) STREET: 1981 McGill College Ave. - Suite 1600 

(C) CITY: Montreal 

(D) STATE: QC 

(E) COUNTRY: Canada 

(F) ZIP: H3A 2Y3 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/029,458 

(B) FILING DATE: 30-OCT-1996 



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Cote, France 

(B) REGISTRATION NUMBER: 4166 

(C) REFERENCE/DOCKET NUMBER: 2139-lOPCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 514 845-7126 

(B) TELEFAX: 514-288-8389 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CGGGATCCAG ACCAACCACT CGAACTACT 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2 
CGGGATCCGA AGGTGAACCA CCATATTTG 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 
GAAGATCTTG TAATCAATGT TCCCGTGGA 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 
GAAGATCTCA TCGTGATATT AAATCCGAT 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic RNA 






(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 355...4044
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

GTACCCACTT TACAATCACT TACAAGTCAA ATAATTACAA CTTGACAATC CTCACTTTAA 60 

GTCTAACGTA TATACGCGTA CACCATCTTA TACTCCACAT ACATATTGGA TTCAATTTTT 120 

ATTTTATTGT TTAGTTTATA TCCAACCACT GACAATTACC AATAGTTTTC AATTAATATT 180 

CACAATTTAA CTATTTGTTT GACAGCTGAA AAGAGATAAA AAAAGAATCA AGTGCTATAA 240 

CTCACAAGGG CTAGAAATAA GTTTGCAAAA AACAAGTTTT AAAAATAGTA ACTGCACTTT 300 

TGTTGACTCT TTCACCTCCC CATTGAATTT AACTGAACAC AAATAAAGCC TATC ATG 357
                                                            Met
                                                            1

AGC ATA CTT TCA GAG AAC AAT CCT ACA CCA ACA TCA ATA ACA GAT CCA 405
Ser Ile Leu Ser Glu Asn Asn Pro Thr Pro Thr Ser Ile Thr Asp Pro
5 10 15

AAT GAG TCT TCT CAT CTA CAC AAC CCA GAG TTA AAC TCT GGA ACG AGG 453
Asn Glu Ser Ser His Leu His Asn Pro Glu Leu Asn Ser Gly Thr Arg
20 25 30

GTT GCT TCT GGA CCT GGA CCT GGA CCT GAA GTT GAA TCA ACA CCA CTA 501
Val Ala Ser Gly Pro Gly Pro Gly Pro Glu Val Glu Ser Thr Pro Leu
35 40 45

GCA CCC CCA ACT GAG GTC ATG AAT ACT ACA TCA GCT AAT ACT TCT TCA 549
Ala Pro Pro Thr Glu Val Met Asn Thr Thr Ser Ala Asn Thr Ser Ser
50 55 60 65

TTA AGT TTA GGG TCT CCA ATG CAC GAG AAA ATA AAA CAA TTT GAT CAA 597
Leu Ser Leu Gly Ser Pro Met His Glu Lys Ile Lys Gln Phe Asp Gln
70 75 80

GAC GAG GTT GAC ACT GGG GAA ACT AAT GAT AGG ACT ATA GAA TCT GGA 645
Asp Glu Val Asp Thr Gly Glu Thr Asn Asp Arg Thr Ile Glu Ser Gly
85 90 95

TCT AGT GAT ATT GAT GAT TCA CAA CAA TCA CAT AAC AAC AAC AAC AAC 693
Ser Ser Asp Ile Asp Asp Ser Gln Gln Ser His Asn Asn Asn Asn Asn
100 105 110

AAC AAC AAC AAC AAC AAC GAG AGC AAT CCA GAA TCA AGT GAA GGC GAT 741
Asn Asn Asn Asn Asn Asn Glu Ser Asn Pro Glu Ser Ser Glu Gly Asp
115 120 125

GAT GAA AAA ACC CAA GGA ATG CCT CCT CGA ATG CCA GGG ACA TTC AAT 789
Asp Glu Lys Thr Gln Gly Met Pro Pro Arg Met Pro Gly Thr Phe Asn
130 135 140 145

GTG AAA GGT TTG CAC CAA GGG GAT GAT AGT GAC AAT GAA AAA CAG TAC 837
Val Lys Gly Leu His Gln Gly Asp Asp Ser Asp Asn Glu Lys Gln Tyr
150 155 160

ACC GAG CTA ACT AAA TCA ATC AAT AAA CGT ACC AGT AAA GAT TCG TAT 885
Thr Glu Leu Thr Lys Ser Ile Asn Lys Arg Thr Ser Lys Asp Ser Tyr
165 170 175






TCT CCT GGC ACA CTT GAA AGT CCC GGT ACT CTT AAT GCA TTG GAA ACA 933 
Ser Pro Gly Thr Leu Glu Ser Pro Gly Thr Leu Asn Ala Leu Glu Thr 
180 185 190 

AAT AAT GTC TCA CCA GCA GTT ATA GAG GAA GAA CAA CAT ACA CTG TCT 981 
Asn Asn Val Ser Pro Ala Val Ile Glu Glu Glu Gln His Thr Leu Ser 
195 200 205 

TTG GAA GAT TTG TCA TTG TCC TTA CAA CAC CAA AAT GAA AAT GCA AGA 1029 
Leu Glu Asp Leu Ser Leu Ser Leu Gln His Gln Asn Glu Asn Ala Arg 
210 215 220 225 

TTA TCT GCA CCC CGC AGT GCA CCG CCA CAG GTT CCG ACT TCA AAG ACA 1077 
Leu Ser Ala Pro Arg Ser Ala Pro Pro Gln Val Pro Thr Ser Lys Thr 
230 235 240 

TCG TCA TTT CAC GAT ATG AGT CTG GTT ATA TCT TCA TCA ACT TCT GTG 1125 
Ser Ser Phe His Asp Met Ser Leu Val Ile Ser Ser Ser Thr Ser Val 
245 250 255 

CAT AAG ATA CCA TCA AAT CCA ACT TCA ACT CGA GGT TCT CAT TTA TCA 1173 
His Lys Ile Pro Ser Asn Pro Thr Ser Thr Arg Gly Ser His Leu Ser 
260 265 270 

AGT TAC AAA TCT ACA TTG GAC CCT GGG AAA CCT GCA CAA GCA GCA GCA 1221 
Ser Tyr Lys Ser Thr Leu Asp Pro Gly Lys Pro Ala Gln Ala Ala Ala 
275 280 285 

CCA CCA CCA CCA GAA ATA GAC ATT GAC AAT TTA TTA ACC AAA AGT GAA 1269 
Pro Pro Pro Pro Glu Ile Asp Ile Asp Asn Leu Leu Thr Lys Ser Glu 
290 295 300 305 

TTG GAT CTG GAA ACA GAC ACA TTG AGT AGT GCC ACA AAT TCT CCA AAC 1317 
Leu Asp Leu Glu Thr Asp Thr Leu Ser Ser Ala Thr Asn Ser Pro Asn 
310 315 320 

CTT TTA AGA AAT GAT ACT TTA CAA GGA ATT CCA ACA AGA GAT GAC GAA 1365 
Leu Leu Arg Asn Asp Thr Leu Gln Gly Ile Pro Thr Arg Asp Asp Glu 
325 330 335 

AAT ATT GAT GAC CTG CCC CGT CAA CTA TCA CAA AAT ACT AGT GCG ACG 1413 
Asn Ile Asp Asp Leu Pro Arg Gln Leu Ser Gln Asn Thr Ser Ala Thr 
340 345 350 

TCA AGA AAT ACT TCG GGA ACA TCG ACT TCT ACA GTG GTG AAA AAT TCA 1461 
Ser Arg Asn Thr Ser Gly Thr Ser Thr Ser Thr Val Val Lys Asn Ser 
355 360 365 

AGA TCT GGT ACG TCA AAA TCA ACC TCA ACC TCA ACT GCT CAT AAC CAA 1509 
Arg Ser Gly Thr Ser Lys Ser Thr Ser Thr Ser Thr Ala His Asn Gln 
370 375 380 385 

ACA GCA GCA ATT ACT CCT ATA ATC CCG AGT CAC AAC AAG TTT CAT CAA 1557 
Thr Ala Ala Ile Thr Pro Ile Ile Pro Ser His Asn Lys Phe His Gln 
390 395 400 

CAA GTG ATA AAT ACC AAT GCA ACA AAT AGT TCA TCT TCA CTA GAA CCA 1605 
Gln Val Ile Asn Thr Asn Ala Thr Asn Ser Ser Ser Ser Leu Glu Pro 
405 410 415 
TTG GGG GTT GGC ATA AAT TCA AAT CTG TCT CCT AAA ACT GGG AAA AAG 1653 
Leu Gly Val Gly Ile Asn Ser Asn Leu Ser Pro Lys Ser Gly Lys Lys 
420 425 430 

CGG AAA AGT GGA AGT AAA GTC CGA GGT GTG TTT TCG TCA ATG TTT GGG 1701 
Arg Lys Ser Gly Ser Lys Val Arg Gly Val Phe Ser Ser Met Phe Gly 
435 440 445 

AAA AAC AAG TCA ACG TCA TCA TCG TCG TCT TCA AAC TCA GGT CTG AAT 1749 
Lys Asn Lys Ser Thr Ser Ser Ser Ser Ser Ser Asn Ser Gly Leu Asn 
450 455 460 465 

AGC CAC TCA CAG GAA GTC AAT ATT AAG ATC AGT ACT CCA TTC AAT GCC 1797 
Ser His Ser Gln Glu Val Asn Ile Lys Ile Ser Thr Pro Phe Asn Ala 
470 475 480 

AAG CAC CTT GCC CAT GTG GGC ATT GAT GAT AAT GGT TCA TAC ACC GGT 1845 
Lys His Leu Ala His Val Gly Ile Asp Asp Asn Gly Ser Tyr Thr Gly 
485 490 495 

TTG CCA ATA GAG TGG GAA AGA TTA TTA TCT GCT AGT GGT ATT ACC AAG 1893 
Leu Pro Ile Glu Trp Glu Arg Leu Leu Ser Ala Ser Gly Ile Thr Lys 
500 505 510 

AAG GAA CAA CAA CAG CAC CCA CAA GCA GTG ATG GAT ATA GTG GCG TTT 1941 
Lys Glu Gln Gln Gln His Pro Gln Ala Val Met Asp Ile Val Ala Phe 
515 520 525 

TAT CAA GAT ACA AGT GAA AAC CCT GAT GAC GCT GCA TTT AAA AAG TTT 1989 
Tyr Gln Asp Thr Ser Glu Asn Pro Asp Asp Ala Ala Phe Lys Lys Phe 
530 535 540 545 

CAT TTT GAT AAT AAT AAA AGT AGT TCG AGT GGT TGG TCT AAT GAA AAT 2037 
His Phe Asp Asn Asn Lys Ser Ser Ser Ser Gly Trp Ser Asn Glu Asn 
550 555 560 

ACT CCA CCA GCA ACA CCG GGT GGG AGT AAC AGT GGC AGT GGC AGT GGT 2085 
Thr Pro Pro Ala Thr Pro Gly Gly Ser Asn Ser Gly Ser Gly Ser Gly 
565 570 575 

GGC GGT GGC GCT CCT TCA AGT CCC CAT CGT ACA CCT CCT TCA TCG ATC 2133 
Gly Gly Gly Ala Pro Ser Ser Pro His Arg Thr Pro Pro Ser Ser lie 
580 585 590 

ATT GAA AAA AAC AAC GTT GAA CAA AAA GTG ATT ACC CCA TCT CAG TCA 2181 
lie Glu Lys Asn Asn Val Glu Gin Lys Val lie Thr Pro Ser Gin Ser 
595 600 605 

ATG CCA ACA AAG ACA GAG AGT AAA CAG CTG GAA AAC CAG CAC CCA CAT 2229 
Met Pro Thr Lys Thr Glu Ser Lys Gin Leu Glu Asn Gin His Pro His 
610 615 620 625 

GAA GAT AAT GCT ACT CAG TAT ACA CCA AGA ACA CCA ACA TCC CAT GTA 2277 
Glu Asp Asn Ala Thr Gin Tyr Thr Pro Arg Thr Pro Thr Ser His Val 
630 635 640

CAA GAG GGT CAA TTT ATT CCA AGT AGA CCA GCT CCG AAA CCA CCA TCA 2325
Gln Glu Gly Gln Phe Ile Pro Ser Arg Pro Ala Pro Lys Pro Pro Ser
645 650 655

ACA CCG CTT TCT TCC ATG AGT GTG TCA CAT AAA ACA CCT TCT TCG CAA 2373
Thr Pro Leu Ser Ser Met Ser Val Ser His Lys Thr Pro Ser Ser Gln
660 665 670

TCA TTA CCA AGG AGT GAT TCA CAA TCC GAT ATT CGT TCT TCA ACC CCT 2421
Ser Leu Pro Arg Ser Asp Ser Gln Ser Asp Ile Arg Ser Ser Thr Pro
675 680 685

AAA TCA CAT CAA GAT GTT TCG CCA AGC AAG ATC AAA ATT CGT TCA ATT 2469
Lys Ser His Gln Asp Val Ser Pro Ser Lys Ile Lys Ile Arg Ser Ile
690 695 700 705

TCG TCA AAA TCA TTA AAG TCA ATG CGG TCT AGA AAA AGT GGG GAT AAG 2517
Ser Ser Lys Ser Leu Lys Ser Met Arg Ser Arg Lys Ser Gly Asp Lys
710 715 720

TTT ACT CAT ATT GCA CCT GCT CCT CCA CCA CCA TCA TTA CCT TCA ATT 2565
Phe Thr His Ile Ala Pro Ala Pro Pro Pro Pro Ser Leu Pro Ser Ile
725 730 735

CCT AAA TCA AAG TCG CAT TCG GCA TCT TTG TCA AGT CAA TTG AGA CCA 2613
Pro Lys Ser Lys Ser His Ser Ala Ser Leu Ser Ser Gln Leu Arg Pro
740 745 750

GCA ACA AAT GGA TCA ACA ACT GCC CCT ATT CCA GCA AGT GCC GCG TTT 2661
Ala Thr Asn Gly Ser Thr Thr Ala Pro Ile Pro Ala Ser Ala Ala Phe
755 760 765

GGT GGT GAG AAT AAT GCT TTA CCA AAA CAA AGA ATA AAT GAG TTC AAG 2709
Gly Gly Glu Asn Asn Ala Leu Pro Lys Gln Arg Ile Asn Glu Phe Lys
770 775 780 785

GCT CAT AGA GCA CCT CCA CCA CCT CCA CTG GCA CCA CCT GCA CCA CCT 2757
Ala His Arg Ala Pro Pro Pro Pro Pro Leu Ala Pro Pro Ala Pro Pro
790 795 800

GTG CCT CCT GCT CCA CCA GCC AAT TTA TTA TCG GAA CAG ACT TCT GAG 2805
Val Pro Pro Ala Pro Pro Ala Asn Leu Leu Ser Glu Gln Thr Ser Glu
805 810 815

ATA CCT CAA CAA CGT ACT GCT CCT CTG CAA GCA TTA GCT GAT GTT ACT 2853
Ile Pro Gln Gln Arg Thr Ala Pro Leu Gln Ala Leu Ala Asp Val Thr
820 825 830

GCC CCA ACT AAT ATT TAT GAA ATT CAA CAA ACT AAA TAT CAG GAA GCA 2901
Ala Pro Thr Asn Ile Tyr Glu Ile Gln Gln Thr Lys Tyr Gln Glu Ala
835 840 845

CAA CAG AAA TTA CGT GAG AAG AAG GCT AGA GAA CTT GAA GAA ATA CAA 2949
Gln Gln Lys Leu Arg Glu Lys Lys Ala Arg Glu Leu Glu Glu Ile Gln
850 855 860 865

AGA CTA CGA GAG AAG AAT GAA AGA CAA AAT AGA CAA CAG GAG ACT GGG 2997
Arg Leu Arg Glu Lys Asn Glu Arg Gln Asn Arg Gln Gln Glu Thr Gly
870 875 880

CAA AAT AAT GCT GAC ACG GCT AGC GGT GGT AGT AAT ATT GCT CCA CCA 3045
Gln Asn Asn Ala Asp Thr Ala Ser Gly Gly Ser Asn Ile Ala Pro Pro
885 890 895

GTA CCT GTA CCA AAT AAA AAA CCG CCT TCT GGA TCT GGT GGT GGC CGT 3093
Val Pro Val Pro Asn Lys Lys Pro Pro Ser Gly Ser Gly Gly Gly Arg
900 905 910

GAT GCC AAA CAA GCA GCT TTG ATA GCC CAA AAG AAA CGA GAA GAA AAG 3141
Asp Ala Lys Gln Ala Ala Leu Ile Ala Gln Lys Lys Arg Glu Glu Lys
915 920 925

AAA CGT AAA AAC TTA CAA ATT ATT GCC AAA TTA AAG ACA ATT TGT AAT 3189
Lys Arg Lys Asn Leu Gln Ile Ile Ala Lys Leu Lys Thr Ile Cys Asn
930 935 940 945

CCT GGA GAT CCA AAT GAA TTA TAT GTT GAT TTA GTT AAA ATT GGT CAA 3237
Pro Gly Asp Pro Asn Glu Leu Tyr Val Asp Leu Val Lys Ile Gly Gln
950 955 960

GGT GCC TCC GGT GGA GTT TTC CTT GCT CAT GAT GTT CGT GAT AAA TCC 3285
Gly Ala Ser Gly Gly Val Phe Leu Ala His Asp Val Arg Asp Lys Ser
965 970 975

AAT ATT GTT GCC ATA AAA CAA ATG AAT TTA GAA CAA CAA CCT AAA AAA 3333
Asn Ile Val Ala Ile Lys Gln Met Asn Leu Glu Gln Gln Pro Lys Lys
980 985 990

GAA TTA ATT ATT AAT GAA ATT TTG GTT ATG AAA GGT AGT CTG CAT CCC 3381
Glu Leu Ile Ile Asn Glu Ile Leu Val Met Lys Gly Ser Leu His Pro
995 1000 1005

AAT ATT GTC AAT TTT ATT GAT TCA TAT CTT TTA AAA GGT GAT TTA TGG 3429
Asn Ile Val Asn Phe Ile Asp Ser Tyr Leu Leu Lys Gly Asp Leu Trp
1010 1015 1020 1025

GTG ATT ATG GAA TAT ATG GAA GGT GGA TCC CTT ACC GAT ATA GTG ACT 3477
Val Ile Met Glu Tyr Met Glu Gly Gly Ser Leu Thr Asp Ile Val Thr
1030 1035 1040

CAT AGT GTT ATG ACC GAA GGT CAA ATT GGA GTT GTA TGT CGT GAA ACT 3525
His Ser Val Met Thr Glu Gly Gln Ile Gly Val Val Cys Arg Glu Thr
1045 1050 1055

TTG AAA GGT CTT AAA TTT TTA CAT TCC AAA GGG GTT ATC CAT CGT GAT 3573
Leu Lys Gly Leu Lys Phe Leu His Ser Lys Gly Val Ile His Arg Asp
1060 1065 1070

ATT AAA TCC GAT AAT ATT TTA TTA AAT ATG GAT GGT AAC ATC AAG ATC 3621
Ile Lys Ser Asp Asn Ile Leu Leu Asn Met Asp Gly Asn Ile Lys Ile
1075 1080 1085

ACT GAT TTT GGG TTT TGT GCT CAA ATC AAT GAA ATC AAT CTG AAA CGT 3669
Thr Asp Phe Gly Phe Cys Ala Gln Ile Asn Glu Ile Asn Leu Lys Arg
1090 1095 1100 1105

ATC ACT ATG GTG GGT ACA CCA TAT TGG ATG GCA CCA GAA ATT GTT TCA 3717
Ile Thr Met Val Gly Thr Pro Tyr Trp Met Ala Pro Glu Ile Val Ser
1110 1115 1120

CGT AAA GAG TAT GGT CCA AAA GTT GAT GTT TGG TCA TTA GGT ATC ATG 3765
Arg Lys Glu Tyr Gly Pro Lys Val Asp Val Trp Ser Leu Gly Ile Met
1125 1130 1135

ATT ATA GAA ATG TTA GAA GGT GAA CCA CCA TAT TTG AAT GAA ACT CCA 3813
Ile Ile Glu Met Leu Glu Gly Glu Pro Pro Tyr Leu Asn Glu Thr Pro
1140 1145 1150

TTG AGG GCA TTA TAT CTT ATT GCA ACT AAT GGT ACA CCA AAA TTA AAA 3861
Leu Arg Ala Leu Tyr Leu Ile Ala Thr Asn Gly Thr Pro Lys Leu Lys
1155 1160 1165

GAT CCT GAA TCT TTA AGT TAT GAT ATT AGA AAA TTT TTG GCA TGG TGT 3909
Asp Pro Glu Ser Leu Ser Tyr Asp Ile Arg Lys Phe Leu Ala Trp Cys
1170 1175 1180 1185

TTA CAA GTT GAC TTT AAT AAA AGA GCT GAT GCT GAT GAA TTA TTA CAT 3957
Leu Gln Val Asp Phe Asn Lys Arg Ala Asp Ala Asp Glu Leu Leu His
1190 1195 1200

GAT AAT TTT ATT ACT GAA TGT GAT GAT GTA TCG TCG TTA AGT CCA TTA 4005
Asp Asn Phe Ile Thr Glu Cys Asp Asp Val Ser Ser Leu Ser Pro Leu
1205 1210 1215

GTG AAA ATT GCT CGA TTG AAA AAA ATG AGT GAA TCT GAT TAATGAATGG TG 4056
Val Lys Ile Ala Arg Leu Lys Lys Met Ser Glu Ser Asp
1220 1225 1230



GAGTTATCCT AGAAATAAAA ACATTTAAAA AAAAAGAAGA AGAACAACAA GAACCCTAAA 4116
TTCTACTGCT GTCAATATAT TGGCTAATTT CCATTCTCGT TTCTATTTCT ATTTCGTTTT 4176
TATTCTTTGA ATTATTATTG TTAGTGGTAG AGATTTTTAC TAGTATATTT TTTTTATTCA 4236
TTTTTATATT TGTATTTATA TATATATTTT TCATTTAGTA TTTACTTACA CTGCAGTATC 4296
TTTCTTTTCT GTGTAGATGA TATGTAGTAA TAAGTTAACT TGTTCAAGAC AGTGAATGGA 4356
AATATATTAT AGCTTGACTA TATAAGGTGG AGAGCTGTAA TTGGCTTTCC GTATAGATW^ 4416
GTCTTGAACA AACGTTACCA GATTTCTGCT ATTCTTATTT GGTACGATTC GGGCGTATGA 4476
TAGGTTTATT GAGCTC 4492



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ser Ile Leu Ser Glu Asn Asn Pro Thr Pro Thr Ser Ile Thr Asp
1 5 10 15

Pro Asn Glu Ser Ser His Leu His Asn Pro Glu Leu Asn Ser Gly Thr
20 25 30

Arg Val Ala Ser Gly Pro Gly Pro Gly Pro Glu Val Glu Ser Thr Pro
35 40 45

Leu Ala Pro Pro Thr Glu Val Met Asn Thr Thr Ser Ala Asn Thr Ser
50 55 60

Ser Leu Ser Leu Gly Ser Pro Met His Glu Lys Ile Lys Gln Phe Asp
65 70 75 80

Gln Asp Glu Val Asp Thr Gly Glu Thr Asn Asp Arg Thr Ile Glu Ser
85 90 95

Gly Ser Ser Asp Ile Asp Asp Ser Gln Gln Ser His Asn Asn Asn Asn
100 105 110

Asn Asn Asn Asn Asn Asn Asn Glu Ser Asn Pro Glu Ser Ser Glu Gly
115 120 125

Asp Asp Glu Lys Thr Gln Gly Met Pro Pro Arg Met Pro Gly Thr Phe
130 135 140

Asn Val Lys Gly Leu His Gln Gly Asp Asp Ser Asp Asn Glu Lys Gln
145 150 155 160

Tyr Thr Glu Leu Thr Lys Ser Ile Asn Lys Arg Thr Ser Lys Asp Ser
165 170 175

Tyr Ser Pro Gly Thr Leu Glu Ser Pro Gly Thr Leu Asn Ala Leu Glu
180 185 190

Thr Asn Asn Val Ser Pro Ala Val Ile Glu Glu Glu Gln His Thr Leu
195 200 205

Ser Leu Glu Asp Leu Ser Leu Ser Leu Gln His Gln Asn Glu Asn Ala
210 215 220

Arg Leu Ser Ala Pro Arg Ser Ala Pro Pro Gln Val Pro Thr Ser Lys
225 230 235 240

Thr Ser Ser Phe His Asp Met Ser Leu Val Ile Ser Ser Ser Thr Ser
245 250 255

Val His Lys Ile Pro Ser Asn Pro Thr Ser Thr Arg Gly Ser His Leu
260 265 270

Ser Ser Tyr Lys Ser Thr Leu Asp Pro Gly Lys Pro Ala Gln Ala Ala
275 280 285

Ala Pro Pro Pro Pro Glu Ile Asp Ile Asp Asn Leu Leu Thr Lys Ser
290 295 300

Glu Leu Asp Leu Glu Thr Asp Thr Leu Ser Ser Ala Thr Asn Ser Pro
305 310 315 320

Asn Leu Leu Arg Asn Asp Thr Leu Gln Gly Ile Pro Thr Arg Asp Asp
325 330 335

Glu Asn Ile Asp Asp Leu Pro Arg Gln Leu Ser Gln Asn Thr Ser Ala
340 345 350

Thr Ser Arg Asn Thr Ser Gly Thr Ser Thr Ser Thr Val Val Lys Asn
355 360 365

Ser Arg Ser Gly Thr Ser Lys Ser Thr Ser Thr Ser Thr Ala His Asn
370 375 380

Gln Thr Ala Ala Ile Thr Pro Ile Ile Pro Ser His Asn Lys Phe His
385 390 395 400

Gln Gln Val Ile Asn Thr Asn Ala Thr Asn Ser Ser Ser Ser Leu Glu
405 410 415

Pro Leu Gly Val Gly Ile Asn Ser Asn Leu Ser Pro Lys Ser Gly Lys
420 425 430

Lys Arg Lys Ser Gly Ser Lys Val Arg Gly Val Phe Ser Ser Met Phe
435 440 445

Gly Lys Asn Lys Ser Thr Ser Ser Ser Ser Ser Ser Asn Ser Gly Leu
450 455 460

Asn Ser His Ser Gln Glu Val Asn Ile Lys Ile Ser Thr Pro Phe Asn
465 470 475 480

Ala Lys His Leu Ala His Val Gly Ile Asp Asp Asn Gly Ser Tyr Thr
485 490 495

Gly Leu Pro Ile Glu Trp Glu Arg Leu Leu Ser Ala Ser Gly Ile Thr
500 505 510

Lys Lys Glu Gln Gln Gln His Pro Gln Ala Val Met Asp Ile Val Ala
515 520 525

Phe Tyr Gln Asp Thr Ser Glu Asn Pro Asp Asp Ala Ala Phe Lys Lys
530 535 540

Phe His Phe Asp Asn Asn Lys Ser Ser Ser Ser Gly Trp Ser Asn Glu
545 550 555 560

Asn Thr Pro Pro Ala Thr Pro Gly Gly Ser Asn Ser Gly Ser Gly Ser
565 570 575

Gly Gly Gly Gly Ala Pro Ser Ser Pro His Arg Thr Pro Pro Ser Ser
580 585 590

Ile Ile Glu Lys Asn Asn Val Glu Gln Lys Val Ile Thr Pro Ser Gln
595 600 605

Ser Met Pro Thr Lys Thr Glu Ser Lys Gln Leu Glu Asn Gln His Pro
610 615 620

His Glu Asp Asn Ala Thr Gln Tyr Thr Pro Arg Thr Pro Thr Ser His
625 630 635 640

Val Gln Glu Gly Gln Phe Ile Pro Ser Arg Pro Ala Pro Lys Pro Pro
645 650 655

Ser Thr Pro Leu Ser Ser Met Ser Val Ser His Lys Thr Pro Ser Ser
660 665 670

Gln Ser Leu Pro Arg Ser Asp Ser Gln Ser Asp Ile Arg Ser Ser Thr
675 680 685

Pro Lys Ser His Gln Asp Val Ser Pro Ser Lys Ile Lys Ile Arg Ser
690 695 700

Ile Ser Ser Lys Ser Leu Lys Ser Met Arg Ser Arg Lys Ser Gly Asp
705 710 715 720

Lys Phe Thr His Ile Ala Pro Ala Pro Pro Pro Pro Ser Leu Pro Ser
725 730 735

Ile Pro Lys Ser Lys Ser His Ser Ala Ser Leu Ser Ser Gln Leu Arg
740 745 750

Pro Ala Thr Asn Gly Ser Thr Thr Ala Pro Ile Pro Ala Ser Ala Ala
755 760 765

Phe Gly Gly Glu Asn Asn Ala Leu Pro Lys Gln Arg Ile Asn Glu Phe
770 775 780

Lys Ala His Arg Ala Pro Pro Pro Pro Pro Leu Ala Pro Pro Ala Pro
785 790 795 800

Pro Val Pro Pro Ala Pro Pro Ala Asn Leu Leu Ser Glu Gln Thr Ser
805 810 815

Glu Ile Pro Gln Gln Arg Thr Ala Pro Leu Gln Ala Leu Ala Asp Val
820 825 830

Thr Ala Pro Thr Asn Ile Tyr Glu Ile Gln Gln Thr Lys Tyr Gln Glu
835 840 845

Ala Gln Gln Lys Leu Arg Glu Lys Lys Ala Arg Glu Leu Glu Glu Ile
850 855 860

Gln Arg Leu Arg Glu Lys Asn Glu Arg Gln Asn Arg Gln Gln Glu Thr
865 870 875 880

Gly Gln Asn Asn Ala Asp Thr Ala Ser Gly Gly Ser Asn Ile Ala Pro
885 890 895

Pro Val Pro Val Pro Asn Lys Lys Pro Pro Ser Gly Ser Gly Gly Gly
900 905 910

Arg Asp Ala Lys Gln Ala Ala Leu Ile Ala Gln Lys Lys Arg Glu Glu
915 920 925

Lys Lys Arg Lys Asn Leu Gln Ile Ile Ala Lys Leu Lys Thr Ile Cys
930 935 940

Asn Pro Gly Asp Pro Asn Glu Leu Tyr Val Asp Leu Val Lys Ile Gly
945 950 955 960

Gln Gly Ala Ser Gly Gly Val Phe Leu Ala His Asp Val Arg Asp Lys
965 970 975

Ser Asn Ile Val Ala Ile Lys Gln Met Asn Leu Glu Gln Gln Pro Lys
980 985 990

Lys Glu Leu Ile Ile Asn Glu Ile Leu Val Met Lys Gly Ser Leu His
995 1000 1005

Pro Asn Ile Val Asn Phe Ile Asp Ser Tyr Leu Leu Lys Gly Asp Leu
1010 1015 1020

Trp Val Ile Met Glu Tyr Met Glu Gly Gly Ser Leu Thr Asp Ile Val
1025 1030 1035 1040

Thr His Ser Val Met Thr Glu Gly Gln Ile Gly Val Val Cys Arg Glu
1045 1050 1055

Thr Leu Lys Gly Leu Lys Phe Leu His Ser Lys Gly Val Ile His Arg
1060 1065 1070

Asp Ile Lys Ser Asp Asn Ile Leu Leu Asn Met Asp Gly Asn Ile Lys
1075 1080 1085

Ile Thr Asp Phe Gly Phe Cys Ala Gln Ile Asn Glu Ile Asn Leu Lys
1090 1095 1100

Arg Ile Thr Met Val Gly Thr Pro Tyr Trp Met Ala Pro Glu Ile Val
1105 1110 1115 1120

Ser Arg Lys Glu Tyr Gly Pro Lys Val Asp Val Trp Ser Leu Gly Ile
1125 1130 1135

Met Ile Ile Glu Met Leu Glu Gly Glu Pro Pro Tyr Leu Asn Glu Thr
1140 1145 1150

Pro Leu Arg Ala Leu Tyr Leu Ile Ala Thr Asn Gly Thr Pro Lys Leu
1155 1160 1165

Lys Asp Pro Glu Ser Leu Ser Tyr Asp Ile Arg Lys Phe Leu Ala Trp
1170 1175 1180

Cys Leu Gln Val Asp Phe Asn Lys Arg Ala Asp Ala Asp Glu Leu Leu
1185 1190 1195 1200

His Asp Asn Phe Ile Thr Glu Cys Asp Asp Val Ser Ser Leu Ser Pro
1205 1210 1215

Leu Val Lys Ile Ala Arg Leu Lys Lys Met Ser Glu Ser Asp
1220 1225 1230



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3496 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 432...3344
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAATTCTTTT TAGAAGAGAA AGAAAAAATT CCCAAAAAAA AAAGATTTCA TTTAATTCCA 60 

CGGGAACATT GATTACAACC ACGTCAACAG TTTCCCTTTT ATATTGAAAT CAACATTCAA 120 

TTTTGTCTTT TTTTTTTTTT CATTGATTTT TCCCCAATCT TTTTATCTTC ATATTAATAT 180

TGGATATCAA TTACTAATAC TGTCAGGGAT AGTTTAGTAA ATATTTACAT TCTCCATTCA 240 

ATCCTAAATT TTTTTTTATA TAGCTAGTTT TTGGTTGAAA AAAAAAAAAT AGGGGGAAGG 300 

AAGTTTTTTT TTCTATTTAT TTAATTGTTT TGATTCCAAC CATATTGTAT ATTTGTCTTG 360

TCAGTTATAT TACTTTCTTG TTACTTAATT ATTAATTATT TGCTATATTA TTGAATTGAA 420

TCCTCAAAAG A ATG ACA AGT ATT TAT ACA TCA GAT TTG AAA AAC CAT AGA 470
Met Thr Ser Ile Tyr Thr Ser Asp Leu Lys Asn His Arg
1 5 10

CGT GCG CCA CCT CCA CCA AAT GGG GCA GCT GGC TCT GGC TCA GGT TCT 518
Arg Ala Pro Pro Pro Pro Asn Gly Ala Ala Gly Ser Gly Ser Gly Ser
15 20 25

GGC TCA GGT TCT GGT TCT GGT TCT GGC AGT TTG GCT AAT ATT GTT ACC 566
Gly Ser Gly Ser Gly Ser Gly Ser Gly Ser Leu Ala Asn Ile Val Thr
30 35 40 45

AGT TCT AAT AGT CTT GGC GTA ACA GCA AAT CAA ACC AAA CCT ATT CAA 614
Ser Ser Asn Ser Leu Gly Val Thr Ala Asn Gln Thr Lys Pro Ile Gln
50 55 60

TTA AAT ATA AAT TCT AGC AAA CGT CAA TCA GGT TGG GTT CAT GTT AAA 662
Leu Asn Ile Asn Ser Ser Lys Arg Gln Ser Gly Trp Val His Val Lys
65 70 75

GAT GAT GGT ATT TTC ACA TCA TTT AGA TGG AAC AAA CGG TTT ATG GTT 710
Asp Asp Gly Ile Phe Thr Ser Phe Arg Trp Asn Lys Arg Phe Met Val
80 85 90

ATT AAT GAT AAA ACT TTA AAC TTT TAT AAA CAA GAA CCA TAT TCT AGT 758
Ile Asn Asp Lys Thr Leu Asn Phe Tyr Lys Gln Glu Pro Tyr Ser Ser
95 100 105

GAT GGT AAT TCC AAT TCT AAT ACC CCT GAT TTA TCA TTC CCA CTA TAT 806
Asp Gly Asn Ser Asn Ser Asn Thr Pro Asp Leu Ser Phe Pro Leu Tyr
110 115 120 125

TTA ATT AAT AAT ATT AAT TTG AAA CCA AAC TCC GGG TAT AGC AAA ACT 854
Leu Ile Asn Asn Ile Asn Leu Lys Pro Asn Ser Gly Tyr Ser Lys Thr
130 135 140

TCA CAA TCA TTT GAA ATT GTT CCC AAA AAC AAT AAT AAA TCA ATT TTG 902
Ser Gln Ser Phe Glu Ile Val Pro Lys Asn Asn Asn Lys Ser Ile Leu
145 150 155

ATT TCT GTT AAA ACC AAT AAT GAT TAT TTG GAT TGG CTA GAT GCA TTC 950
Ile Ser Val Lys Thr Asn Asn Asp Tyr Leu Asp Trp Leu Asp Ala Phe
160 165 170

ACC ACA AAA TGT CCT TTA GTA CAA ATT GGT GAA AAT AAT AGT GGT GTA 998
Thr Thr Lys Cys Pro Leu Val Gln Ile Gly Glu Asn Asn Ser Gly Val
175 180 185

TCA AGT AGT CAC CCT CAT TTA CAA ATT CAA CAT TTA ACC AAT GGT TCC 1046
Ser Ser Ser His Pro His Leu Gln Ile Gln His Leu Thr Asn Gly Ser
190 195 200 205

TTG AAC GGC AAC TCA TCT TCA TCA CCA ACA TCT GGA TTA TTA TCT TCT 1094
Leu Asn Gly Asn Ser Ser Ser Ser Pro Thr Ser Gly Leu Leu Ser Ser
210 215 220

TCA GTG CTA ACT GGA GGT AAT TCT GGC GTT TCT GGT CCT ATT AAT TTC 1142
Ser Val Leu Thr Gly Gly Asn Ser Gly Val Ser Gly Pro Ile Asn Phe
225 230 235

ACT CAT AAA GTA CAC GTG GGA TTT GAT CCT GCC AGT GGT AAT TTT ACT 1190
Thr His Lys Val His Val Gly Phe Asp Pro Ala Ser Gly Asn Phe Thr
240 245 250

GGA TTA CCA GAC ACT TGG AAA AGT TTA TTA CAA CAT TCG AAA ATC ACT 1238
Gly Leu Pro Asp Thr Trp Lys Ser Leu Leu Gln His Ser Lys Ile Thr
255 260 265

AAT GAG GAT TGG AAA AAA GAT CCT GTT GCT GTT ATT GAA GTT TTA GAA 1286
Asn Glu Asp Trp Lys Lys Asp Pro Val Ala Val Ile Glu Val Leu Glu
270 275 280 285

TTT TAT TCC GAT ATA AAT GGA GGT AAT TCA GCT GCT GGA ACT CCA ATT 1334
Phe Tyr Ser Asp Ile Asn Gly Gly Asn Ser Ala Ala Gly Thr Pro Ile
290 295 300

GGA TCA CCC ATG ATC AAT TCC AAA ACC AAC AAT AAT AAT AAT GAC CCT 1382
Gly Ser Pro Met Ile Asn Ser Lys Thr Asn Asn Asn Asn Asn Asp Pro
305 310 315

AAC AAT TAC TCA TCA ACC AAA AAC AAT GTC CAA GAG GCA AAT TTA CAA 1430
Asn Asn Tyr Ser Ser Thr Lys Asn Asn Val Gln Glu Ala Asn Leu Gln
320 325 330

GAA TGG GTA AAA CCT CCA GCA AAA TCT ACT GTC TCA CAA TTC AAA CCT 1478
Glu Trp Val Lys Pro Pro Ala Lys Ser Thr Val Ser Gln Phe Lys Pro
335 340 345

AGT CGA GCT GCA CCA AAA CCA CCA ACT CCA TAT CAT TTG ACA CAA CTA 1526
Ser Arg Ala Ala Pro Lys Pro Pro Thr Pro Tyr His Leu Thr Gln Leu
350 355 360 365

AAT GGC TCT TCC CAC CAA CAT ACA TCA TCA TCA GGC TCA TTA CCT AGT 1574
Asn Gly Ser Ser His Gln His Thr Ser Ser Ser Gly Ser Leu Pro Ser
370 375 380

TCT GGT AAT AAT AAT AAT AAT AAC AGC ACT AAC AAT AAT AAT ACT AAA 1622
Ser Gly Asn Asn Asn Asn Asn Asn Ser Thr Asn Asn Asn Asn Thr Lys
385 390 395

AAC GTT TCA CCA TTG AAT AAT TTG ATG AAT AAA TCT GAA CTT ATT CCT 1670
Asn Val Ser Pro Leu Asn Asn Leu Met Asn Lys Ser Glu Leu Ile Pro
400 405 410

GCT AGA AGA GCT CCA CCA CCT CCA ACA AGT GGC ACA TCT TCA GAT ACA 1718
Ala Arg Arg Ala Pro Pro Pro Pro Thr Ser Gly Thr Ser Ser Asp Thr
415 420 425

TAT TCT AAT AAG AAT CAT CAA GAT AGA TCT GGA TAT GAA CAA CAA CGT 1766
Tyr Ser Asn Lys Asn His Gln Asp Arg Ser Gly Tyr Glu Gln Gln Arg
430 435 440 445

CAA CAA CGT ACT GAC TCA TCA CAA CAA CAA CAA CAA CAA AAG CAA CAT 1814
Gln Gln Arg Thr Asp Ser Ser Gln Gln Gln Gln Gln Gln Lys Gln His
450 455 460

CAA TAT CAA CAG AAA TCC CAA CAA CAA CAA CAA CAA CCA CAA CAA CCA 1862
Gln Tyr Gln Gln Lys Ser Gln Gln Gln Gln Gln Gln Pro Gln Gln Pro
465 470 475

TTA TCT CTG CAT CAA GGT GGG ACT TCG CAT ATT CCG AAA CAA GTA CCT 1910
Leu Ser Leu His Gln Gly Gly Thr Ser His Ile Pro Lys Gln Val Pro
480 485 490

CCT ACA TTA CCA TCA TCT GGA CCA CCC ACT CAG GCT GCT TCA GGA AAA 1958
Pro Thr Leu Pro Ser Ser Gly Pro Pro Thr Gln Ala Ala Ser Gly Lys
495 500 505

TCA ATG CCA TCT AAA ATT CAT CCT GAT CTT AAG ATT CAA CAA GGC ACA 2006
Ser Met Pro Ser Lys Ile His Pro Asp Leu Lys Ile Gln Gln Gly Thr
510 515 520 525

AAT AAT TAT ATT AAG AGT AGC GGT ACT GAT GCT AAT CAA GTC GAT GGT 2054
Asn Asn Tyr Ile Lys Ser Ser Gly Thr Asp Ala Asn Gln Val Asp Gly
530 535 540

GAT GCT AAA CAA TTT ATT AAA CCA TTT AAT TTA CAA CTG AAA AAG AGT 2102
Asp Ala Lys Gln Phe Ile Lys Pro Phe Asn Leu Gln Leu Lys Lys Ser
545 550 555

CAG CAA CAA TTG GCA TCA AAA CAA CCG TCA CCA CCT TCA TCT CAA CAA 2150
Gln Gln Gln Leu Ala Ser Lys Gln Pro Ser Pro Pro Ser Ser Gln Gln
560 565 570

CAG CAA CAA AAA CCT ATG ACA TCA CAT GGA TTA ATG GGT ACA TCA CAT 2198
Gln Gln Gln Lys Pro Met Thr Ser His Gly Leu Met Gly Thr Ser His
575 580 585

TCA GTT ACT AAA CCA TTG AAT CCA GTC AAT GAT CCA ATC AAA CCA TTA 2246
Ser Val Thr Lys Pro Leu Asn Pro Val Asn Asp Pro Ile Lys Pro Leu
590 595 600 605

AAT TTG AAA TCA TCT AAA TCC AAA GAA GCA TTA AAT GAA ACT CTG GGG 2294
Asn Leu Lys Ser Ser Lys Ser Lys Glu Ala Leu Asn Glu Thr Leu Gly
610 615 620

GTG CTG AAA ACA CCA TCA CCT ACA GAT AAA TCA AAT AAA CCA ACT GCA 2342
Val Leu Lys Thr Pro Ser Pro Thr Asp Lys Ser Asn Lys Pro Thr Ala
625 630 635

CCT GCT AGT GGT CCT GCA GTG ACC AAA ACA GCT AAA CAA CTC AAG AAG 2390
Pro Ala Ser Gly Pro Ala Val Thr Lys Thr Ala Lys Gln Leu Lys Lys
640 645 650

GAA CGA GAA AGA TTG AAT GAT TTA CAA ATC ATT GCT AAA TTG AAA ACA 2438
Glu Arg Glu Arg Leu Asn Asp Leu Gln Ile Ile Ala Lys Leu Lys Thr
655 660 665

GTG GTT AAT AAT CAA GAT CCT AAA CCA TTA TTT AGA ATT GTT GAA AAA 2486
Val Val Asn Asn Gln Asp Pro Lys Pro Leu Phe Arg Ile Val Glu Lys
670 675 680 685

GCT GGT CAA GGT GCT AGT GGG AAT GTT TAT TTG GCG GAA ATG ATC AAA 2534
Ala Gly Gln Gly Ala Ser Gly Asn Val Tyr Leu Ala Glu Met Ile Lys
690 695 700

GAT AAT AAT CGA AAG ATT GCG ATT AAA CAA ATG GAT CTT GAT GCT CAA 2582
Asp Asn Asn Arg Lys Ile Ala Ile Lys Gln Met Asp Leu Asp Ala Gln
705 710 715

CCC CGT AAA GAG TTA ATA ATA AAT GAA ATC TTG GTT ATG AAA GAT AGT 2630
Pro Arg Lys Glu Leu Ile Ile Asn Glu Ile Leu Val Met Lys Asp Ser
720 725 730



CAA CAT AAA AAT ATT GTT AAT TTT TTG GAT TCT TAT TTA ATT GGT GAT 2678
Gln His Lys Asn Ile Val Asn Phe Leu Asp Ser Tyr Leu Ile Gly Asp
735 740 745



AAT GAA TTA TGG GTA ATT ATG GAA TAT ATG CAA GGT GGT TCA TTA ACG 2726
Asn Glu Leu Trp Val Ile Met Glu Tyr Met Gln Gly Gly Ser Leu Thr
750 755 760 765

GAA ATC ATT GAA AAT AAT GAT TTT AAA TTG AAT GAA AAA CAA ATT GCC 2774
Glu Ile Ile Glu Asn Asn Asp Phe Lys Leu Asn Glu Lys Gln Ile Ala
770 775 780

ACT ATA TGT TTT GAA ACC TTA AAG GGG TTA CAA CAT TTA CAT AAA AAA 2822
Thr Ile Cys Phe Glu Thr Leu Lys Gly Leu Gln His Leu His Lys Lys
785 790 795

CAT ATT ATT CAT CGT GAT ATT AAA TCC GAT AAT GTT TTA TTA GAT GCA 2870
His Ile Ile His Arg Asp Ile Lys Ser Asp Asn Val Leu Leu Asp Ala
800 805 810

TAT GGT AAT GTT AAA ATC ACT GAT TTT GGA TTT TGT GCT AAA TTA ACT 2918
Tyr Gly Asn Val Lys Ile Thr Asp Phe Gly Phe Cys Ala Lys Leu Thr
815 820 825

GAT CAA AGA AAT AAA CGT GCC ACA ATG GTG GGG ACA CCA TAT TGG ATG 2966
Asp Gln Arg Asn Lys Arg Ala Thr Met Val Gly Thr Pro Tyr Trp Met
830 835 840 845

GCA CCT GAA GTG GTT AAA CAA AAG GAA TAT GAT GAA AAA GTT GAT GTT 3014
Ala Pro Glu Val Val Lys Gln Lys Glu Tyr Asp Glu Lys Val Asp Val
850 855 860

TGG TCA TTG GGG ATT ATG ACT ATT GAA ATG ATT GAA GGA GAA CCA CCT 3062
Trp Ser Leu Gly Ile Met Thr Ile Glu Met Ile Glu Gly Glu Pro Pro
865 870 875

TAT TTG AAT GAA GAA CCA TTA AAA GCT TTA TAT CTT ATA GCT ACT AAT 3110
Tyr Leu Asn Glu Glu Pro Leu Lys Ala Leu Tyr Leu Ile Ala Thr Asn
880 885 890

GGT ACA CCA AAA TTG AAA AAA CCC GAA TTA TTA TCG AAT TCA ATT AAA 3158
Gly Thr Pro Lys Leu Lys Lys Pro Glu Leu Leu Ser Asn Ser Ile Lys
895 900 905

AAA TTC TTA TCA ATT TGT CTT TGT GTT GAT GTT AGA TAT CGT GCT AGT 3206
Lys Phe Leu Ser Ile Cys Leu Cys Val Asp Val Arg Tyr Arg Ala Ser
910 915 920 925

ACT GAT GAA TTA TTA GAA CAT TCA TTT ATT CAA CAT AAA TCA GGG AAA 3254
Thr Asp Glu Leu Leu Glu His Ser Phe Ile Gln His Lys Ser Gly Lys
930 935 940

ATT GAA GAA TTG GCA CCA TTA TTA GAA TGG AAA AAA CAA CAA CAA AAG 3302
Ile Glu Glu Leu Ala Pro Leu Leu Glu Trp Lys Lys Gln Gln Gln Lys
945 950 955

CAT CAA CAG CAT AAA CAA GAA ACA CTG GAT ACA GGA TTT GCA TAGAGATTG 3353
His Gln Gln His Lys Gln Glu Thr Leu Asp Thr Gly Phe Ala
960 965 970

AATATAGCCG TAGAAAACTG GTACTTTGGT TTTGGTATAA TATATTTATG TGATGTGTTG 3413
TGTGTATGGT TAGTTTAGAT TTGGATTTTT AGTTTTTTAG AGTTTAGTTT TTCAATTTTT 3473
AGTTTTAGAG ACAATATTCT AGA 3496



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 971 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



Met Thr Ser Ile Tyr Thr Ser Asp Leu Lys Asn His Arg Arg Ala Pro
1 5 10 15

Pro Pro Pro Asn Gly Ala Ala Gly Ser Gly Ser Gly Ser Gly Ser Gly
20 25 30

Ser Gly Ser Gly Ser Gly Ser Leu Ala Asn Ile Val Thr Ser Ser Asn
35 40 45

Ser Leu Gly Val Thr Ala Asn Gln Thr Lys Pro Ile Gln Leu Asn Ile
50 55 60

Asn Ser Ser Lys Arg Gln Ser Gly Trp Val His Val Lys Asp Asp Gly
65 70 75 80

Ile Phe Thr Ser Phe Arg Trp Asn Lys Arg Phe Met Val Ile Asn Asp
85 90 95

Lys Thr Leu Asn Phe Tyr Lys Gln Glu Pro Tyr Ser Ser Asp Gly Asn
100 105 110

Ser Asn Ser Asn Thr Pro Asp Leu Ser Phe Pro Leu Tyr Leu Ile Asn
115 120 125

Asn Ile Asn Leu Lys Pro Asn Ser Gly Tyr Ser Lys Thr Ser Gln Ser
130 135 140

Phe Glu Ile Val Pro Lys Asn Asn Asn Lys Ser Ile Leu Ile Ser Val
145 150 155 160

Lys Thr Asn Asn Asp Tyr Leu Asp Trp Leu Asp Ala Phe Thr Thr Lys
165 170 175

Cys Pro Leu Val Gln Ile Gly Glu Asn Asn Ser Gly Val Ser Ser Ser
180 185 190

His Pro His Leu Gln Ile Gln His Leu Thr Asn Gly Ser Leu Asn Gly
195 200 205

Asn Ser Ser Ser Ser Pro Thr Ser Gly Leu Leu Ser Ser Ser Val Leu



210 215 220

Thr Gly Gly Asn Ser Gly Val Ser Gly Pro Ile Asn Phe Thr His Lys
225 230 235 240

Val His Val Gly Phe Asp Pro Ala Ser Gly Asn Phe Thr Gly Leu Pro
245 250 255

Asp Thr Trp Lys Ser Leu Leu Gln His Ser Lys Ile Thr Asn Glu Asp
260 265 270

Trp Lys Lys Asp Pro Val Ala Val Ile Glu Val Leu Glu Phe Tyr Ser
275 280 285

Asp Ile Asn Gly Gly Asn Ser Ala Ala Gly Thr Pro Ile Gly Ser Pro
290 295 300

Met Ile Asn Ser Lys Thr Asn Asn Asn Asn Asn Asp Pro Asn Asn Tyr
305 310 315 320

Ser Ser Thr Lys Asn Asn Val Gln Glu Ala Asn Leu Gln Glu Trp Val
325 330 335

Lys Pro Pro Ala Lys Ser Thr Val Ser Gln Phe Lys Pro Ser Arg Ala
340 345 350

Ala Pro Lys Pro Pro Thr Pro Tyr His Leu Thr Gln Leu Asn Gly Ser
355 360 365

Ser His Gln His Thr Ser Ser Ser Gly Ser Leu Pro Ser Ser Gly Asn
370 375 380

Asn Asn Asn Asn Asn Ser Thr Asn Asn Asn Asn Thr Lys Asn Val Ser
385 390 395 400

Pro Leu Asn Asn Leu Met Asn Lys Ser Glu Leu Ile Pro Ala Arg Arg
405 410 415

Ala Pro Pro Pro Pro Thr Ser Gly Thr Ser Ser Asp Thr Tyr Ser Asn
420 425 430

Lys Asn His Gln Asp Arg Ser Gly Tyr Glu Gln Gln Arg Gln Gln Arg
435 440 445

Thr Asp Ser Ser Gln Gln Gln Gln Gln Gln Lys Gln His Gln Tyr Gln
450 455 460

Gln Lys Ser Gln Gln Gln Gln Gln Gln Pro Gln Gln Pro Leu Ser Leu
465 470 475 480

His Gln Gly Gly Thr Ser His Ile Pro Lys Gln Val Pro Pro Thr Leu
485 490 495

Pro Ser Ser Gly Pro Pro Thr Gln Ala Ala Ser Gly Lys Ser Met Pro
500 505 510

Ser Lys Ile His Pro Asp Leu Lys Ile Gln Gln Gly Thr Asn Asn Tyr
515 520 525

Ile Lys Ser Ser Gly Thr Asp Ala Asn Gln Val Asp Gly Asp Ala Lys
530 535 540

Gln Phe Ile Lys Pro Phe Asn Leu Gln Leu Lys Lys Ser Gln Gln Gln
545 550 555 560

Leu Ala Ser Lys Gln Pro Ser Pro Pro Ser Ser Gln Gln Gln Gln Gln
565 570 575

Lys Pro Met Thr Ser His Gly Leu Met Gly Thr Ser His Ser Val Thr
580 585 590

Lys Pro Leu Asn Pro Val Asn Asp Pro Ile Lys Pro Leu Asn Leu Lys
595 600 605

Ser Ser Lys Ser Lys Glu Ala Leu Asn Glu Thr Leu Gly Val Leu Lys
610 615 620

Thr Pro Ser Pro Thr Asp Lys Ser Asn Lys Pro Thr Ala Pro Ala Ser
625 630 635 640

Gly Pro Ala Val Thr Lys Thr Ala Lys Gln Leu Lys Lys Glu Arg Glu
645 650 655

Arg Leu Asn Asp Leu Gln Ile Ile Ala Lys Leu Lys Thr Val Val Asn
660 665 670

Asn Gln Asp Pro Lys Pro Leu Phe Arg Ile Val Glu Lys Ala Gly Gln
675 680 685

520 










525 








Ser 


Gly 


Thr 


Asp 


Ala 


Asn 


Gin 


Val 


Asp 


Gly 


Asp 


Ala 


Lys 








c o c 

535 










540 










Lys 


Pro 


Pne 


Asn 


Leu 


Gin 


Leu 


Lys 


Lys 


Ser 


Gin 


Gin 


Gin 






C C f\ 

550 










555 










560 


Lys 


Gin 


Pro 


Ser 


Pro 


Pro 


Ser 


Ser 


Gin 


Gin 


Gin 


Gin 


Gin 




c ^ c 

565 










C T 

570 










575 




mr 


i>er 


nis 


iiiy 


Leu 


Met 


QiXy 


Tnr 


Ser 


His 


Ser 


Val 


Thr 


580 










585 










590 






Asn 


Pro 


Val 


Asn 


Asp 


Pro 


He 


Lys 


Pro 


Leu 


Asn 


Leu 


Lys 










600 










605 








Ser 


Lys 


oXU 


J\XcL 


Leu 


Asn 


oXU 


xnr 


Leu 


t>xy 


vaj. 


Leu 


Lys 








615 










620 










Pro 


Thr 


Asp 


Lys 


Ser 


Asn 


Lys 


Pro 


Thr 


Ala 


Pro 


Ala 


Ser 






630 










635 










640 


Val 


Thr 


Lys 


Thr 


Ala 


Lys 


Gin 


Leu 


Lys 


Lys 


Glu 


Arg 


Glu 




645 










650 










655 




Asp 


Leu 


Gin 


He 


He 


Ala 


Lys 


Leu 


Lys 


Thr 


Val 


Val 


Asn 


660 










665 










670 






Pro 


Lys 


Pro 


Leu 


Phe 


Arg 


He 


Val 


Glu 


Lys 


Ala 


Gly 


Gin 



680 685 
Gly


Ala 


Ser 


Gly 


Asn 


Val 


Tyr 


Leu 




690 










695 




Arg 


Lys 


He 


Ala 


He 


Lys 


Gin 


Met 


705 










710 






Glu 


Leu 


He 


He 


Asn 


Glu 


He 


Leu 










725 








Asn 


He 


Val 


Asn 


Phe 


Leu 


Asp 


Ser 








740 










Trp 


Val 


He 


Met 


Glu 


Tyr 


Met 


Gin 






755 










760 


Glu 


Asn 


Asn 


Asp 


Phe 


Lys 


Leu 


Asn 




770 










775 




Phe 


Glu 


Thr 


Leu 


Lys 


Gly 


Leu 


Gin 


785 










790 






His 


Arg 


Asp 


He 


Lys 


Ser 


Asp 


Asn 










805 








Val 


Lys 


He 


Thr 


Asp 


Phe 


Gly 


Phe 








820 










Asn 


Lys 


Arg 


Ala 


Thr 


Met 


Val 


Gly 






835 










840 


Val 


Val 


Lys 


Gin 


Lys 


Glu 


Tyr 


Asp 




850 










855 




Gly 


He 


Met 


Thr 


He 


Glu 


Met 


He 


865 










870 






Glu 


Glu 


Pro 


Leu 


Lys 


Ala 


Leu 


Tyr 










885 








Lys 


Leu 


Lys 


Lys 


Pro 


Glu 


Leu 


Leu 








900 










Ser 


He 


Cys 


Leu 


Cys 


Val 


Asp 


Val 






915 










920 


Leu 


Leu 


Glu 


His 


Ser 


Phe 


He 


Gin 




930 










935 




Leu 


Ala 


Pro 


Leu 


Leu 


Glu 


Trp 


Lys 


945 










950 






His 


Lys 


Gin 


Glu 


Thr 


Leu 


Asp 


Thr 



965 



Ala 


Glu 


Met 


He 


Lys 


TVsp Asn Asn 








700 










Asp 


Leu 


Asp 
715 


Ala 


Gin 


Pro 


Arg 


Lys 
720 


Val 


Met 
730 


Lys 


Asp 


Ser 


Gin 


His 
735 


Lys 


Tyr 


Leu 


He 


Gly 


Asp Asn 


Glu 


Leu 


745 










750 






Gly 


Gly 


Ser 


Leu 


Thr 
765 


Glu 


He 


He 


Glu 


Lys 


Gin 


He 
780 


Ala 


Thr 


He 


Cys 


His 


Leu 


His 
795 


Lys 


Lys 


His 


He 


He 
800 


Val 


Leu 


Leu Asp 


Ala 


Tyr 


Gly Asn 




810 










815 




Cys 


Ala 


Lys 


Leu 


Thr Asp 


Gin Arg 


825 










830 






Thr 


Pro 


Tyr 


Trp 


Met 
845 


Ala 


Pro 


Glu 


Glu 


Lys 


Val 


Asp 
860 


Val 


Trp 


Ser 


Leu 


Glu 


Gly 


Glu 
875 


Pro 


Pro 


Tyr 


Leu 


Asn 
880 


Leu 


He 


Ala 


Thr 


Asn 


Gly Thr 


Pro 




890 










895 




Ser 


Asn 


Ser 


He 


Lys 


Lys 


Phe 


Leu 


905 










910 






Arg 


Tyr 


Arg Ala 


Ser 


Thr Asp 


Glu 










925 








His 


Lys 


Ser 


Gly 
940 


Lys 


He 


Glu 


Glu 


Lys 


Gin 


Gin 
955 


Gin 


Lys 


His 


Gin 


Gin 
960 


Gly 


Phe 
970 


Ala 













(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1031 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 271... 843 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CAACCAAACC AACTTTCATC CTTCTACCAA TATCTTCAAC AAAAGTTTTA TTCAATACTA 60 

TTTTAAAAAT AACAGTGTTA CTCGTTCATT TGATTTGTTA ATAAGACTGA TTTACCCACT 120 

TTTTAGTTCC TATAATCATA CAGATTTCTC GTCCTAAATC TATTTTTATT GTTATTTTTA 180 

CTTTAGTTTT CACTTTTGCT TTCAGTTTTT TCTTTTTTTA GCACAAGAGA AAAGTATTCA 240 

GCTCATAAAT AATTAATATA TCCATATATC ATG CAA ACT ATA AAA TGT GTT GTT 294 

Met Gln Thr Ile Lys Cys Val Val
1 5

GTC GGT GAT GGT GCC GTT GGT AAA ACT TGC TTA TTA ATC TCG TAT ACC 342
Val Gly Asp Gly Ala Val Gly Lys Thr Cys Leu Leu Ile Ser Tyr Thr
10 15 20

ACT AGT AAA TTT CCA GCT GAT TAT GTT CCT ACT GTT TTT GAT AAT TAT 390
Thr Ser Lys Phe Pro Ala Asp Tyr Val Pro Thr Val Phe Asp Asn Tyr
25 30 35 40

GCT GTA ACC GTG ATG ATA GGA GAC GAA CCA TTT ACC TTG GGA TTA TTT 438
Ala Val Thr Val Met Ile Gly Asp Glu Pro Phe Thr Leu Gly Leu Phe
45 50 55

GAT ACT GCT GGT CAA GAA GAT TAC GAC AGA TTA AGG CCT TTG TCA TAT 486
Asp Thr Ala Gly Gln Glu Asp Tyr Asp Arg Leu Arg Pro Leu Ser Tyr
60 65 70

CCA TCG ACT GAT GTA TTC CTT GTT TGT TTT TCC GTC ATT TCT CCC GCT 534
Pro Ser Thr Asp Val Phe Leu Val Cys Phe Ser Val Ile Ser Pro Ala
75 80 85

TCG TTT GAA AAT GTT AAA GAA AAA TGG TTC CCA GAA GTT CAT CAC CAT 582
Ser Phe Glu Asn Val Lys Glu Lys Trp Phe Pro Glu Val His His His
90 95 100

TGT CCC GGT GTG CCA ATA ATT ATT GTC GGT ACC CAA ACT GAT TTA CGA 630
Cys Pro Gly Val Pro Ile Ile Ile Val Gly Thr Gln Thr Asp Leu Arg
105 110 115 120

AAC GAT GAT GTT ATT TTA CAG AGA TTG CAC AGA CAA AAA TTG TCC CCA 678
Asn Asp Asp Val Ile Leu Gln Arg Leu His Arg Gln Lys Leu Ser Pro
125 130 135

ATC ACC CAG GAA CAG GGT GAA AAA TTG GCT AAG GAA TTG AGA GCT GTC 726
Ile Thr Gln Glu Gln Gly Glu Lys Leu Ala Lys Glu Leu Arg Ala Val
140 145 150

AAG TAT GTT GAG TGT TCT GGA TTG ACT CAA AGA GGA TTG AAA ACA GTG 774
Lys Tyr Val Glu Cys Ser Ala Leu Thr Gln Arg Gly Leu Lys Thr Val
155 160 165

TTT GAC GAG GCT ATA GTA GCT GCA TTA GAA CCT CCT GTA ATT AAA AAA 822
Phe Asp Glu Ala Ile Val Ala Ala Leu Glu Pro Pro Val Ile Lys Lys
170 175 180

TCG AAA AAG TGT ACT ATT TTA TAGGTCGGCG ATACTAGAAG ATAGAGGATA TTGG 877
Ser Lys Lys Cys Thr Ile Leu
185 190 



AAATAGGGCA TACATGAGAT ATTGAATATC TATCATTAAA TATATAATTA GTTTTTTTCT 937 
AAAACCTATC TTTAGGTTTG ATCTCGTTTG ATGTGTTGGG CTGTTTCGCA AAACAGTGTT 997 
CCAATCAATA AAAAGATGTG TGTAAGACTC TAGA 1031 
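
As a cross-check on the listing above, the coding-sequence feature (271...843) spans 573 nucleotides, i.e. 191 codons, which agrees with the 191 amino acids stated for SEQ ID NO: 10. The following Python sketch illustrates that check and re-derives the translation of the first and last coding lines. It assumes the standard genetic code, which is what the translation printed in the listing appears to follow; the names `codon_count` and `translate` are illustrative only and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# Assumption: the listing's printed translation follows this table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def codon_count(start, end):
    """Number of whole codons in a 1-based, inclusive CDS location."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate one reading frame into three-letter residues.

    Whitespace is ignored; translation stops at the first stop codon.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First and last coding lines of SEQ ID NO: 9, copied from the listing.
first = translate("ATG CAA ACT ATA AAA TGT GTT GTT")
last = translate("TCG AAA AAG TGT ACT ATT TTA TAG")

print(" ".join(first))
print(" ".join(last))
print(codon_count(271, 843))
```

Run as written, the sketch prints `Met Gln Thr Ile Lys Cys Val Val`, then `Ser Lys Lys Cys Thr Ile Leu`, then `191`. The same length check applied to the coding sequence of SEQ ID NO: 11 (291...2195) gives 635 codons.
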
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



Met Gln Thr Ile Lys Cys Val Val Val Gly Asp Gly Ala Val Gly Lys
1 5 10 15

Thr Cys Leu Leu Ile Ser Tyr Thr Thr Ser Lys Phe Pro Ala Asp Tyr
20 25 30

Val Pro Thr Val Phe Asp Asn Tyr Ala Val Thr Val Met Ile Gly Asp
35 40 45

Glu Pro Phe Thr Leu Gly Leu Phe Asp Thr Ala Gly Gln Glu Asp Tyr
50 55 60

Asp Arg Leu Arg Pro Leu Ser Tyr Pro Ser Thr Asp Val Phe Leu Val
65 70 75 80

Cys Phe Ser Val Ile Ser Pro Ala Ser Phe Glu Asn Val Lys Glu Lys
85 90 95

Trp Phe Pro Glu Val His His His Cys Pro Gly Val Pro Ile Ile Ile
100 105 110

Val Gly Thr Gln Thr Asp Leu Arg Asn Asp Asp Val Ile Leu Gln Arg
115 120 125

Leu His Arg Gln Lys Leu Ser Pro Ile Thr Gln Glu Gln Gly Glu Lys
130 135 140

Leu Ala Lys Glu Leu Arg Ala Val Lys Tyr Val Glu Cys Ser Ala Leu
145 150 155 160

Thr Gln Arg Gly Leu Lys Thr Val Phe Asp Glu Ala Ile Val Ala Ala
165 170 175

Leu Glu Pro Pro Val Ile Lys Lys Ser Lys Lys Cys Thr Ile Leu
180 185 190



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2231 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 291... 2195 
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AAGCTTGTTT CTTATCTCCT TAGTATATTG TTTTACAACA CCACATACAC ATACACATAT 60 
AGCCTTCATT AGCCTTCATT TTGACATATT TCAATAACAA TCAAGAACTA CAAGTCATAA 120 
CTGACACACA TATAATATCT TAATTGTTAT TATAAATTTA TTCTTGATTA GATTTTAGAC 180 
GGGCAGAAAC 7WWVCGGAA AATCCAACTC ATCCCCGATA ACTACACACA TCTATATTAA 240
ATCATCTATT AGTCTATCAG TTATATCTCC CTCCCCTTTT CTTCTAACAA ATG ATT 296

Met Ile
1

AAG ACG TTT CGG AAA AGT AAA AGA CTG TCG AGT AAT TCA AGT TCA CCC 344 
Lys Thr Phe Arg Lys Ser Lys Arg Leu Ser Ser Asn Ser Ser Ser Pro 
5 10 15 

AAG AAA ACA ATA TCT CGA GTA TCA TCA ACT TCA AGT AAT CAA ACA TCT 392 
Lys Lys Thr Ile Ser Arg Val Ser Ser Thr Ser Ser Asn Gln Thr Ser
20 25 30

CAT GAT GGA ATA TTA CAA TCA CCT AAA AAA GTC ATT AGA GCT CTA TAT 440
His Asp Gly Ile Leu Gln Ser Pro Lys Lys Val Ile Arg Ala Leu Tyr
35 40 45 50

GAT TAT GAA CCT CAA GGT CCT GGA GAA TTG AAA TTT TTC AAA GGA GAT 488
Asp Tyr Glu Pro Gln Gly Pro Gly Glu Leu Lys Phe Phe Lys Gly Asp
55 60 65

TTT TTC CAT GTA TTA AAT GAT GTT GAT GAT GAA TTA CAT AAA GAA GCG 536
Phe Phe His Val Leu Asn Asp Val Asp Asp Glu Leu His Lys Glu Ala
70 75 80

GAA CGT AAT GGA TGG ATA GAA GCA ACA AAT CCA ATG ACT CAA CTT AAA 584
Glu Arg Asn Gly Trp Ile Glu Ala Thr Asn Pro Met Thr Gln Leu Lys
85 90 95

GGG ATG GTC CCC ATT AGT TAT TTT GAA ATA TTT GAT CGA TCT CGT CCT 632
Gly Met Val Pro Ile Ser Tyr Phe Glu Ile Phe Asp Arg Ser Arg Pro
100 105 110

ACA GTT ACA GCA TCA TCA AAC AGT TTT ACA AAT TCC ATT GAT ATT CAA 680
Thr Val Thr Ala Ser Ser Asn Ser Phe Thr Asn Ser Ile Asp Ile Gln
115 120 125 130

CAT CAA CAT CAA CAA GGA ATT CAC AAT GGA ACA GGA AAT CGA AAT TTA 728
His Gln His Gln Gln Gly Ile His Asn Gly Thr Gly Asn Arg Asn Leu
135 140 145

AAT CAA ACA TTA TAT GCT GTT ACA CTA TAT GAA TTT AAA GCT GAA CGA 776
Asn Gln Thr Leu Tyr Ala Val Thr Leu Tyr Glu Phe Lys Ala Glu Arg
150 155 160

GAT GAT GAA TTG GAT ATA ATG CCT AAT GAA AAT TTA ATT ATT TGT GCA 824
Asp Asp Glu Leu Asp Ile Met Pro Asn Glu Asn Leu Ile Ile Cys Ala
165 170 175

CAT CAT GAT TAT GAA TGG TTT ATT GCC AAA CCA ATA AAT CGA TTA GGT 872
His His Asp Tyr Glu Trp Phe Ile Ala Lys Pro Ile Asn Arg Leu Gly
180 185 190

GGA CCA GGT TTA GTA CCT GTT TCT TAT GTT AAA ATA ATT GAT CTT TTA 920
Gly Pro Gly Leu Val Pro Val Ser Tyr Val Lys Ile Ile Asp Leu Leu
195 200 205 210
AAC CCT AAT TCT CAT TAT ACA TCA ATT GAT ACA TCA AGG CGA TCA CAA 968
Asn Pro Asn Ser His Tyr Thr Ser Ile Asp Thr Ser Arg Arg Ser Gln
215 220 225

GTC ATA CAA GTA ATC AAT GGA TTT AAT ATA CCG ACA GTA GAA CAA TGG 1016
Val Ile Gln Val Ile Asn Gly Phe Asn Ile Pro Thr Val Glu Gln Trp
230 235 240

AAA AAT CAA ACT GCC AAA TAT CAA GCT TCA ACA ATC CCC CTT GGT TCA 1064
Lys Asn Gln Thr Ala Lys Tyr Gln Ala Ser Thr Ile Pro Leu Gly Ser
245 250 255

ATA TCA GGA AGT GGT ACT CCA CCA ACA TCA GCT AAT TCA CAA TAT TTT 1112
Ile Ser Gly Ser Gly Thr Pro Pro Thr Ser Ala Asn Ser Gln Tyr Phe
260 265 270

GAT AAT CAT ACT ATG ACT TCA AAT CGA TCA TCA CTG GGT TCA TCA ATT 1160
Asp Asn His Thr Met Thr Ser Asn Arg Ser Ser Leu Gly Ser Ser Ile
275 280 285 290

TCT ATT ATT GAA GCT AGT GTT GAT TCA TAT CAA TTA GAT CAT GGT CGA 1208
Ser Ile Ile Glu Ala Ser Val Asp Ser Tyr Gln Leu Asp His Gly Arg
295 300 305

TAT CAA TAT TCA ATA ACT GCT CGA TTA AAT AAT GGC AGA ATA AGA TAT 1256
Tyr Gln Tyr Ser Ile Thr Ala Arg Leu Asn Asn Gly Arg Ile Arg Tyr
310 315 320

TTA TAT CGA TAT TAT CAA GAT TTT TAT GAT TTA CAA GTG AAA TTA TTA 1304
Leu Tyr Arg Tyr Tyr Gln Asp Phe Tyr Asp Leu Gln Val Lys Leu Leu
325 330 335

GAA TTA TTT CCT TAT GAA GCT GGG AGA ATT GAA AAT TCT AAA AGA ATA 1352
Glu Leu Phe Pro Tyr Glu Ala Gly Arg Ile Glu Asn Ser Lys Arg Ile
340 345 350

ATT CCA TCT ATA CCA GGA CCT TTA ATT AAT GTC AAT GAT TCA ATA TCA 1400
Ile Pro Ser Ile Pro Gly Pro Leu Ile Asn Val Asn Asp Ser Ile Ser
355 360 365 370

AAA TTA CGA AGA GAA AAA TTG GAT TAT TAT TTA TCA AAT TTA ATT GCA 1448
Lys Leu Arg Arg Glu Lys Leu Asp Tyr Tyr Leu Ser Asn Leu Ile Ala
375 380 385

TTA CCT AGT CAT ATA TCT CGA TCA GAA GAA GTA TTA AAA TTA TTT GAT 1496
Leu Pro Ser His Ile Ser Arg Ser Glu Glu Val Leu Lys Leu Phe Asp
390 395 400

GTT TTA GAT AAT GGA TTT GAT CGA GAA ACT GAT GCT ATT AAT AAA CGA 1544
Val Leu Asp Asn Gly Phe Asp Arg Glu Thr Asp Ala Ile Asn Lys Arg
405 410 415

TTT TCT AAA CCA ATA AGT CAA AAA TCA AAT TCT CAT CAA GAT AGA TTA 1592
Phe Ser Lys Pro Ile Ser Gln Lys Ser Asn Ser His Gln Asp Arg Leu
420 425 430

TCT CAA TAT TCC AAT TTT AAC GTT TTA CAA CAA CAA CAA CAA CAA CAG 1640
Ser Gln Tyr Ser Asn Phe Asn Val Leu Gln Gln Gln Gln Gln Gln Gln
435 440 445 450 
CAA CAA CAG CAA TAT GCT CAT CAT TCA AGA GGT TCT GAT AAT TCA CCT 1688
Gln Gln Gln Gln Tyr Ala His His Ser Arg Gly Ser Asp Asn Ser Pro
455 460 465

ACT AAT GAA TCA TCA GGT TCA AAT TTA ATT AAT TCT TCT TCT CAT AAT 1736
Thr Asn Glu Ser Ser Gly Ser Asn Leu Ile Asn Ser Ser Ser His Asn
470 475 480

GAT TCA TCA TTA TCT TCA TCA CCA CCA CCA CCA CCA CCA CAA ACT GTC 1784
Asp Ser Ser Leu Ser Ser Ser Pro Pro Pro Pro Pro Pro Gln Thr Val
485 490 495

ACC ACC ACG AAC ACC ACG AAC ACC ACC ATA ACC ACA GAC TCC TCA TCA 1832
Thr Thr Thr Asn Thr Thr Asn Thr Thr Ile Thr Thr Asp Ser Ser Ser
500 505 510

AAA CAA CCA AAA GCC AAA GTG AAA TTT TAT TTT GAT GAT GAT ATA TTT 1880
Lys Gln Pro Lys Ala Lys Val Lys Phe Tyr Phe Asp Asp Asp Ile Phe
515 520 525 530

GTA TTA TTA ATC CCA ACC AAT TTA CGA TTA CAA GAT TTA AAA TCA AAA 1928
Val Leu Leu Ile Pro Thr Asn Leu Arg Leu Gln Asp Leu Lys Ser Lys
535 540 545

TTA TTT AAA CGA TTA GAA TTG GAT ATT ACT TAT AAA TAT GAA AAA CCT 1976
Leu Phe Lys Arg Leu Glu Leu Asp Ile Thr Tyr Lys Tyr Glu Lys Pro
550 555 560

GAT CAA CAA CAA AAA CCT ACA TCA GAA TCA ATT CAT TTA TTT TTG AAA 2024
Asp Gln Gln Gln Lys Pro Thr Ser Glu Ser Ile His Leu Phe Leu Lys
565 570 575

AAT GAT TTT GAA GAT TTT TTA ATT GAA AAT GAA ACT AGC AAC AAC AAC 2072
Asn Asp Phe Glu Asp Phe Leu Ile Glu Asn Glu Thr Ser Asn Asn Asn
580 585 590

AAT CTG GAA ATT GAT TTC GAA AAT GAA ATT ATT AAA GAA AAA TTA GGA 2120
Asn Leu Glu Ile Asp Phe Glu Asn Glu Ile Ile Lys Glu Lys Leu Gly
595 600 605 610

GAA TTT GAA GTT AAT GAT GAT GAA AAA TTT CAA AGT ATT TTA TTT GAT 2168
Glu Phe Glu Val Asn Asp Asp Glu Lys Phe Gln Ser Ile Leu Phe Asp
615 620 625 

AAA TGT AAA TTA ATG GTT TTA GTA TAT TAAACAGAGA TCAATAAGAG AGAGAGA 2222 
Lys Cys Lys Leu Met Val Leu Val Tyr 
630 635 

GAGAGACAT 2231 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 635 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 



Met Ile Lys Thr Phe Arg Lys Ser Lys Arg Leu Ser Ser Asn Ser Ser
1 5 10 15

Ser Pro Lys Lys Thr Ile Ser Arg Val Ser Ser Thr Ser Ser Asn Gln
20 25 30

Thr Ser His Asp Gly Ile Leu Gln Ser Pro Lys Lys Val Ile Arg Ala
35 40 45

Leu Tyr Asp Tyr Glu Pro Gln Gly Pro Gly Glu Leu Lys Phe Phe Lys
50 55 60

Gly Asp Phe Phe His Val Leu Asn Asp Val Asp Asp Glu Leu His Lys
65 70 75 80

Glu Ala Glu Arg Asn Gly Trp Ile Glu Ala Thr Asn Pro Met Thr Gln
85 90 95

Leu Lys Gly Met Val Pro Ile Ser Tyr Phe Glu Ile Phe Asp Arg Ser
100 105 110

Arg Pro Thr Val Thr Ala Ser Ser Asn Ser Phe Thr Asn Ser Ile Asp
115 120 125

Ile Gln His Gln His Gln Gln Gly Ile His Asn Gly Thr Gly Asn Arg
130 135 140

Asn Leu Asn Gln Thr Leu Tyr Ala Val Thr Leu Tyr Glu Phe Lys Ala
145 150 155 160

Glu Arg Asp Asp Glu Leu Asp Ile Met Pro Asn Glu Asn Leu Ile Ile
165 170 175

Cys Ala His His Asp Tyr Glu Trp Phe Ile Ala Lys Pro Ile Asn Arg
180 185 190

Leu Gly Gly Pro Gly Leu Val Pro Val Ser Tyr Val Lys Ile Ile Asp
195 200 205

Leu Leu Asn Pro Asn Ser His Tyr Thr Ser Ile Asp Thr Ser Arg Arg
210 215 220

Ser Gln Val Ile Gln Val Ile Asn Gly Phe Asn Ile Pro Thr Val Glu
225 230 235 240

Gln Trp Lys Asn Gln Thr Ala Lys Tyr Gln Ala Ser Thr Ile Pro Leu
245 250 255

Gly Ser Ile Ser Gly Ser Gly Thr Pro Pro Thr Ser Ala Asn Ser Gln
260 265 270

Tyr Phe Asp Asn His Thr Met Thr Ser Asn Arg Ser Ser Leu Gly Ser
275 280 285

Ser Ile Ser Ile Ile Glu Ala Ser Val Asp Ser Tyr Gln Leu Asp His
290 295 300

Gly Arg Tyr Gln Tyr Ser Ile Thr Ala Arg Leu Asn Asn Gly Arg Ile
305 310 315 320

Arg Tyr Leu Tyr Arg Tyr Tyr Gln Asp Phe Tyr Asp Leu Gln Val Lys
325 330 335

Leu Leu Glu Leu Phe Pro Tyr Glu Ala Gly Arg Ile Glu Asn Ser Lys
340 345 350

Arg Ile Ile Pro Ser Ile Pro Gly Pro Leu Ile Asn Val Asn Asp Ser
355 360 365

Ile Ser Lys Leu Arg Arg Glu Lys Leu Asp Tyr Tyr Leu Ser Asn Leu
370 375 380

Ile Ala Leu Pro Ser His Ile Ser Arg Ser Glu Glu Val Leu Lys Leu
385 390 395 400

Phe Asp Val Leu Asp Asn Gly Phe Asp Arg Glu Thr Asp Ala Ile Asn
405 410 415

Lys Arg Phe Ser Lys Pro Ile Ser Gln Lys Ser Asn Ser His Gln Asp
420 425 430

Arg Leu Ser Gln Tyr Ser Asn Phe Asn Val Leu Gln Gln Gln Gln Gln
435 440 445

Gln Gln Gln Gln Gln Gln Tyr Ala His His Ser Arg Gly Ser Asp Asn
450 455 460

Ser Pro Thr Asn Glu Ser Ser Gly Ser Asn Leu Ile Asn Ser Ser Ser
465 470 475 480

His Asn Asp Ser Ser Leu Ser Ser Ser Pro Pro Pro Pro Pro Pro Gln
485 490 495

Thr Val Thr Thr Thr Asn Thr Thr Asn Thr Thr Ile Thr Thr Asp Ser
500 505 510

Ser Ser Lys Gln Pro Lys Ala Lys Val Lys Phe Tyr Phe Asp Asp Asp
515 520 525

Ile Phe Val Leu Leu Ile Pro Thr Asn Leu Arg Leu Gln Asp Leu Lys
530 535 540

Ser Lys Leu Phe Lys Arg Leu Glu Leu Asp Ile Thr Tyr Lys Tyr Glu
545 550 555 560

Lys Pro Asp Gln Gln Gln Lys Pro Thr Ser Glu Ser Ile His Leu Phe
565 570 575

Leu Lys Asn Asp Phe Glu Asp Phe Leu Ile Glu Asn Glu Thr Ser Asn
580 585 590

Asn Asn Asn Leu Glu Ile Asp Phe Glu Asn Glu Ile Ile Lys Glu Lys
595 600 605

Leu Gly Glu Phe Glu Val Asn Asp Asp Glu Lys Phe Gln Ser Ile Leu
610 615 620

Phe Asp Lys Cys Lys Leu Met Val Leu Val Tyr
625 630 635
WE CLAIM:

1. An in vitro screening test for compounds to
inhibit the biological activity of at least one protein
selected from the group consisting of CaCla4p, Cst20p,
CaCdc42p and CaBem1p, which comprises:

a) at least one of said proteins; and

b) means to monitor the biological activity of said
at least one protein;

whereby compounds are tested for their inhibiting
potential.

2. The screening test of claim 1, wherein the
inhibition of the interactions between CaCla4p and
CaCdc42p is determined.

3. The screening test of claim 1, wherein the
inhibition of the interactions between Cst20p and
CaCdc42p is determined.

4. The screening test of claim 1, wherein the
inhibition of the interactions between CaCla4p and
CaBem1p is determined.

5. The screening test of claim 1, wherein the
inhibition of the interactions between Cst20p and
CaBem1p is determined.

6. A method for determining at least one gene
involved in filamentous growth associated with
virulence, which comprises using one protein selected
from the group consisting of CaCla4p, Cst20p, CaCdc42p
and CaBem1p to determine said gene.